Member since: 10-17-2016
Posts: 93
Kudos Received: 10
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3005 | 09-28-2017 04:38 PM |
| | 4849 | 08-24-2017 06:12 PM |
| | 1090 | 07-03-2017 12:20 PM |
12-03-2017
02:18 PM
@Ashutosh Mestry any thoughts?
11-29-2017
12:06 PM
Atlas is a governance tool. Two of the key pillars of data governance are accountability and meeting compliance requirements. To establish accountability and traceability, tools usually support lineage information, which helps answer questions such as where the data came from, who modified it, and how it was modified. Compliance requirements for industries like healthcare and finance can be very strict: the origins of the data must be known without any ambiguity. Since Atlas claims to help organizations meet their compliance requirements, consider the scenario presented in the attached figure. lineage-accountability.png

In the figure, a process reads a few data items and then writes them to two different databases. Atlas can capture cross-component lineage and will record the inputs and the outputs of the process. But how can we determine which input went to which database? There can be a situation where all records from data item 1 are written to database 2 and the remaining two data items are written to database 1. In such a case the lineage is ambiguous: all I would know is that the data could have come from any of the data sources. Will such information be enough to meet compliance requirements?

The second question I have is regarding performance. Currently Kafka does not support Atlas V2, so when developing the Spark Atlas addon I used the REST API to post the entities. Since I am also handling Spark Streaming, the number of entity notifications can be high. Can I run into scalability issues in such a scenario? Approximately what rate can the REST API handle before messages are dropped? Thanks in advance
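To make the ambiguity concrete, here is a minimal sketch (the type names, guids, and values are placeholders, not my actual entities) of how a process entity in the V2 REST format lists its inputs and outputs; note that they are flat sets, with no mapping from a particular input to a particular output:
{
  "entity": {
    "typeName": "Process",
    "attributes": {
      "qualifiedName": "copy_process@cluster",
      "name": "copy_process",
      "inputs": [
        { "typeName": "DataSet", "guid": "-1" },
        { "typeName": "DataSet", "guid": "-2" },
        { "typeName": "DataSet", "guid": "-3" }
      ],
      "outputs": [
        { "typeName": "DataSet", "guid": "-4" },
        { "typeName": "DataSet", "guid": "-5" }
      ]
    }
  }
}
All the lineage graph would record is that the process read the three inputs and wrote the two outputs; which records from which input ended up in which database is not expressed.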
Labels:
- Apache Atlas
11-09-2017
07:04 PM
Hi @Ashutosh Mestry, my first question is why Atlas doesn't show the entire lineage in one go. Have a look at the attached pictures; they represent a single chain. Notice that the first lineage ends at rdd 3, and then I have to open rdd 3 to see what happened further. Can it not display the entire chain at once? What determines how much of the chain will be shown from a given entity? screenshot-from-2017-11-09-19-58-36.png screenshot-from-2017-11-09-19-59-42.png screenshot-from-2017-11-09-20-00-08.png
11-04-2017
06:44 PM
What determines how much lineage will be displayed? I have huge lineage diagrams, but Atlas seems to randomly choose to show parts of the tree at different points. Should it not show the entire lineage tree if I am at the root data set? Also, Atlas seems to get stuck when I have a lineage diagram that consists of 200+ entities; I see the loading wheel forever. Thanks
Labels:
- Apache Atlas
10-23-2017
07:13 PM
Hi, I have a scenario where, after reading JSON files, I'm doing InvokeHTTP against a URL attribute in each JSON file. This returns a further list of JSON objects with URL attributes, which I later split and run InvokeHTTP against individually to get a result. The problem is that at the end I need a composed JSON for each flow, combining the initial JSON that I read from the file, the later JSON objects, and the final result received after hitting each individual URL, and I need to save this composed JSON as a record in MongoDB. I'm having trouble building this JSON, so I need help with the flow and processors. Thanks
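To be clear about what I mean by a composed record, this is a hypothetical sketch of the document I would like to end up with in MongoDB for each flow (all field names and values are placeholders):
{
  "initialJson": { "url": "http://example.com/item/1" },
  "detailJson": { "url": "http://example.com/item/1/detail" },
  "finalResult": { "status": "ok" }
}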
Labels:
- Apache NiFi
10-12-2017
11:48 PM
1 Kudo
Hi, I have the following type defined in Atlas. Notice that it extends both DataSet and Process. You can post this type definition using Postman at this URL: http://localhost:21000/api/atlas/v2/types/typedefs
{
"stuctDefs":[],
"classificationDefs": [],
"entityDefs" : [ {
"name": "spark_testStage",
"superTypes" : ["Process",
"DataSet"],
"attributeDefs" : [
{
"name" : "test",
"typeName": "string",
"isOptional" : true,
"cardinality": "SINGLE",
"isIndexable": false,
"isUnique": false
},
{
"name": "description",
"typeName": "string",
"cardinality": "SINGLE",
"isIndexable": true,
"isOptional": true,
"isUnique": false
} ]
}]
}
Now I create the following entities using this URL: http://localhost:21000/api/atlas/v2/entity
{
"referredEntities":
{
"-208942807557405": {
"typeName": "spark_testStage",
"attributes": {
"owner": "spark",
"qualifiedName": "Stage6@clusterName",
"name": "Stage6",
"description": "this is attribute is inclued due to inheritance"
},
"guid": "-208942807557405",
"version": 0,
"inputs":
[
{
"typeName": "spark_testStage",
"attributes":
{
"source": "testing",
"description": null,
"qualifiedName": "Stage5@clusterName",
"name": "Stage5",
"owner": null,
"destination": "hdfs://vimal-fenton-4-1.openstacklocal:80ion"
},
"guid": "-208942807557404"
}
],
"classifications": []
}
},
"entity":
{
"guid": "-208942807557404",
"status": "ACTIVE",
"version": 0,
"typeName": "spark_testStage",
"attributes" :
{
"qualifiedName" : "Stage5@clusterName",
"name" : "Stage5",
"test" : "this is source",
"description" : "source",
"outputs":
[
{
"typeName": "spark_testStage",
"attributes":
{
"source": "testing",
"description": null,
"qualifiedName": "Stage6@clusterName",
"name": "Stage6",
"owner": null,
"destination": "hdfs://vimal-fenton-4-1.openstacklocal:8020/apps/hive/warehouse/destination"
},
"guid": "-208942807557405"
}
]
},
"classifications": []
}
}
Why doesn't Atlas draw a line between the two entities? The request is successful, but I don't see any lineage. Also, I see that Stage5 has Stage6 in its outputs and is correctly linked; I can click on this link to Stage6, but Stage6 does not have Stage5 as its input.
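For comparison, here is a minimal sketch of how I understand the built-in models express lineage, with the inputs and outputs carried as object-id references inside attributes (the guids and names are placeholders, and I am not certain whether the placement of inputs outside attributes in my payload above is what breaks the lineage):
{
  "entity": {
    "guid": "-1",
    "typeName": "spark_testStage",
    "attributes": {
      "qualifiedName": "Stage6@clusterName",
      "name": "Stage6",
      "inputs": [ { "typeName": "spark_testStage", "guid": "-2" } ],
      "outputs": []
    }
  }
}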
Labels:
- Apache Atlas
10-10-2017
02:45 PM
@anaik @Ashutosh Mestry @Sarath Subramanian @Vadim Vaks any suggestions?
10-10-2017
02:33 PM
I was wondering how cross-component lineage is handled in Atlas. I do see the example where data is loaded from Kafka into Storm and then into HDFS. The question is: is it specifically stated somewhere in the code which Kafka topic Storm read from and which HDFS directory it wrote to, or is this handled dynamically via some event notifications? For example, if I read data from Kafka into Spark, will I have to specify this information in the code of the Spark application, thus changing the application? The tutorial is at Cross Component Scripts. This tutorial makes use of a JAR file. Is it possible to get the source code, or can someone point me to the code where this information is handled?

Spark has jobs, stages, and tasks. We can model and send this information to Atlas via the hook to capture the details of what goes on inside Spark. What about Spark Streaming? Spark Streaming has the same structure, only the jobs are repeated every batch interval. Since streaming applications are long running, sending this much detailed information would make little sense: it might overwhelm the system and would largely be redundant. Any suggestions on how streaming information should be sent to Atlas and what information should be sent? Thanks
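To illustrate the kind of information I am asking about, here is a hypothetical sketch of an entity a Spark hook could emit, with the Kafka topic and HDFS directory expressed as input/output references; the spark_application type name and all values are my own placeholders, not an existing Atlas model:
{
  "entity": {
    "typeName": "spark_application",
    "attributes": {
      "qualifiedName": "my-streaming-app@cluster",
      "name": "my-streaming-app",
      "inputs": [ { "typeName": "kafka_topic", "uniqueAttributes": { "qualifiedName": "my_topic@cluster" } } ],
      "outputs": [ { "typeName": "hdfs_path", "uniqueAttributes": { "qualifiedName": "/data/out@cluster" } } ]
    }
  }
}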
Labels:
- Apache Atlas
10-05-2017
05:01 PM
Hi @Ashutosh Mestry, I was using the JSON you provided in the zip file. I tried the other file, which works fine. The zip file is probably the response once the entities are created. Thanks
10-04-2017
10:06 AM
Thank you @Ashutosh Mestry for such a detailed response, it helps a lot! Unfortunately the hive entities are not created. I copied your hive table entities JSON into Postman and I get the following error:
{
"errorCode": "ATLAS-404-00-00A",
"errorMessage": "Referenced entity 027a987e-867a-4c98-ac1e-c5ded41130d3 is not found"
}
10-03-2017
06:47 PM
Going through the Atlas technical user guide I see the attribute isComposite, which I do not see in any of the new model definitions for V2. Is this attribute still valid? How can it be specified in V2? I also see that constraints are defined in the Hive model, such as ownedRef, mappedFromRef, foreignKey, and onDelete. Where can I get the complete list of constraints and their meaning? What is the relationship for? I see that hive_table refers to an array of columns as its attribute and vice versa, and then there is also a relationship defined called hive_table_columns. What additional benefit does creating the relationship provide? How do I create entities when handling compositions? If the table is composed of columns, do I define the columns within the table creation request? An example creation of a hive table and column (entities for V2) would be of great help. Thanks @Ashutosh Mestry
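For reference, this is a minimal sketch of what I understand the V2 replacement for isComposite to look like: the owning side carries an ownedRef constraint on its attributeDef (trimmed and illustrative, not the full hive model):
{
  "entityDefs": [ {
    "name": "hive_table",
    "superTypes": [ "DataSet" ],
    "attributeDefs": [ {
      "name": "columns",
      "typeName": "array<hive_column>",
      "cardinality": "SET",
      "isOptional": true,
      "isIndexable": false,
      "isUnique": false,
      "constraints": [ { "type": "ownedRef" } ]
    } ]
  } ]
}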
Labels:
- Apache Atlas
09-28-2017
04:38 PM
Thanks for the reply. @anaik Natively means that I have built and installed Atlas directly from the website on my laptop. @Ashutosh Mestry I am writing a hook for Spark. Spark already has a very good notification mechanism via its event listeners, and I want to capture these events in real time to send data to Atlas. Since I want to write a hook, I am not going to focus on the REST API. I thought it would be a good idea to have a look at the Storm bridge and run its test to better understand how something similar can be achieved for Spark. I am also using IntelliJ for the development. I imported the Storm hook project (located at atlas_home/addons/storm_bridge) into IntelliJ and tried running the test case it comes with, and got the metadata timeout error from Kafka: Atlas starts successfully, and the problem arises when sending a notification to the Kafka topics. I was able to solve the issue; it turns out I needed to add the atlas.properties file to the project. One last thing: I see that whenever Atlas starts it does a type init: 2017-09-27 00:09:26,564 INFO - [main:] ~ ==> AtlasTypeDefStoreInitializer.loadBootstrapTypeDefs() (AtlasTypeDefStoreInitializer:109)
2017-09-27 00:09:26,572 INFO - [main:] ~ ==> AtlasTypeDefStoreInitializer(/usr/hdp/2.6.2.0-162/atlas/models) (AtlasTypeDefStoreInitializer:146)
2017-09-27 00:09:26,602 INFO - [main:] ~ No type in file /usr/hdp/2.6.2.0-162/atlas/models/-worker.log.0.current (AtlasTypeDefStoreInitializer:165)
2017-09-27 00:09:26,665 INFO - [main:] ~ No new type in file /usr/hdp/2.6.2.0-162/atlas/models/0010-base_model.json (AtlasTypeDefStoreInitializer:178)
2017-09-27 00:09:26,669 INFO - [main:] ~ No new type in file /usr/hdp/2.6.2.0-162/atlas/models/0020-fs_model.json (AtlasTypeDefStoreInitializer:178)
2017-09-27 00:09:26,673 INFO - [main:] ~ No new type in file /usr/hdp/2.6.2.0-162/atlas/models/0030-hive_model.json (AtlasTypeDefStoreInitializer:178)
So I can simply add my Spark types at the same location and they will also be loaded automatically at startup. In my case the location is: /home/arsalan/Development/atlas/distro/target/apache-atlas-0.9-SNAPSHOT-bin/apache-atlas-0.9-SNAPSHOT/models/ However, I notice a difference in the structure: if we look at the 1080-storm_model.json file, for example, it defines the types under "entityDefs". As far as I know from the type system, the types should have been defined under the ClassType tag as shown in this tutorial: Atlas Custom Type Creation. File located at: Type Definition Json Github Link. Any idea why the discrepancy?
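For anyone following along, here is a minimal sketch of a model file in the V2 layout that the bootstrap loader above picks up from the models directory (the spark_application type name and its attribute are my own placeholders); the ClassType layout shown in the tutorial is the older V1 format:
{
  "enumDefs": [],
  "structDefs": [],
  "classificationDefs": [],
  "entityDefs": [ {
    "name": "spark_application",
    "superTypes": [ "Process" ],
    "attributeDefs": [ {
      "name": "startTime",
      "typeName": "string",
      "cardinality": "SINGLE",
      "isOptional": true,
      "isIndexable": false,
      "isUnique": false
    } ]
  } ]
}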
09-27-2017
07:28 PM
@anaik @Jay SenSharma @Ashutosh Mestry @Vadim Vaks
09-27-2017
07:22 PM
Hi I have installed atlas natively on ubuntu (not in HDP). I want to write a hook to import data to Atlas. I am also using Berkely DB for the storage. I am facing the following error: 189115 [Thread-29-globalCount-executor[6 6]] INFO o.a.s.d.executor - Processing received message FOR 6 TUPLE: source: count:4, stream: default, id: {}, [bertels, 370]189115 [Thread-29-globalCount-executor[6 6]] INFO o.a.s.d.task - Emitting: globalCount default [2635]189115 [Thread-29-globalCount-executor[6 6]] INFO o.a.s.d.executor - BOLT ack TASK: 6 TIME: TUPLE: source: count:4, stream: default, id: {}, [bertels, 370]189115 [Thread-29-globalCount-executor[6 6]] INFO o.a.s.d.executor - Execute done TUPLE source: count:4, stream: default, id: {}, [bertels, 370] TASK: 6 DELTA: 2017-09-12 16:58:49,456 ERROR - [main:] ~ {"version":{"version":"1.0.0"},"message":{"entities":[{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference","id":{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id","id":"-88302553345504","version":0,"typeName":"storm_topology","state":"ACTIVE"},"typeName":"storm_topology","values":{"name":"word-count","startTime":"2017-09-12T14:55:46.903Z","outputs":[],"id":"word-count-1-1505228146","inputs":[],"qualifiedName":"word-count","nodes":[{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference","id":{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id","id":"-88302553345502","version":0,"typeName":"storm_bolt","state":"ACTIVE"},"typeName":"storm_bolt","values":{"name":"globalCount","driverClass":"org.apache.storm.testing.TestGlobalCount","conf":{"TestGlobalCount._count":"0"},"inputs":["count"]},"traitNames":[],"traits":{},"systemAttributes":{}},{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference","id":{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id","id":"-88302553345503","version":0,"typeName":"storm_spout","state":"ACTIVE"},"typeName":"storm_spout","values":{"outputs":["count"],"name":"words","driverClass":"org.apache.storm.testing.TestWordSpout","conf":{"TestWordSpout._isDistributed":"true"}},"traitNames":[],"traits":{},"systemAttributes":{}},{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference","id":{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id","id":"-88302553345501","version":0,"typeName":"storm_bolt","state":"ACTIVE"},"typeName":"storm_bolt","values":{"name":"count","outputs":["globalCount"],"driverClass":"org.apache.storm.topology.BasicBoltExecutor","conf":{},"inputs":["words"]},"traitNames":[],"traits":{},"systemAttributes":{}}],"owner":"arsalan","clusterName":"primary"},"traitNames":[],"traits":{},"systemAttributes":{}}],"type":"ENTITY_CREATE","user":"arsalan"}} (FailedMessagesLogger:95)189129 [main] ERROR o.a.a.h.AtlasHook - Failed to notify atlas for entity [[{Id='(type: storm_topology, id: <unassigned>)', traits=[], values={outputs=[], owner=arsalan, nodes=[{Id='(type: storm_bolt, id: <unassigned>)', traits=[], values={name=globalCount, conf={TestGlobalCount._count=0}, driverClass=org.apache.storm.testing.TestGlobalCount, inputs=[count]}}, {Id='(type: storm_spout, id: <unassigned>)', traits=[], values={name=words, outputs=[count], conf={TestWordSpout._isDistributed=true}, driverClass=org.apache.storm.testing.TestWordSpout}}, {Id='(type: storm_bolt, id: <unassigned>)', traits=[], values={name=count, outputs=[globalCount], conf={}, driverClass=org.apache.storm.topology.BasicBoltExecutor, 
inputs=[words]}}], inputs=[], qualifiedName=word-count, clusterName=primary, name=word-count, startTime=2017-09-12T14:55:46.903Z, id=word-count-1-1505228146}}]] after 3 retries. Quittingorg.apache.atlas.notification.NotificationException: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms. at org.apache.atlas.kafka.KafkaNotification.sendInternalToProducer(KafkaNotification.java:236) ~[atlas-notification-0.9-SNAPSHOT.jar:0.9-SNAPSHOT] at org.apache.atlas.kafka.KafkaNotification.sendInternal(KafkaNotification.java:209) ~[atlas-notification-0.9-SNAPSHOT.jar:0.9-SNAPSHOT] at org.apache.atlas.notification.AbstractNotification.send(AbstractNotification.java:84) ~[atlas-notification-0.9-SNAPSHOT.jar:0.9-SNAPSHOT] at org.apache.atlas.hook.AtlasHook.notifyEntitiesInternal(AtlasHook.java:133) [atlas-notification-0.9-SNAPSHOT.jar:0.9-SNAPSHOT] at org.apache.atlas.hook.AtlasHook.notifyEntities(AtlasHook.java:118) [atlas-notification-0.9-SNAPSHOT.jar:0.9-SNAPSHOT] at org.apache.atlas.hook.AtlasHook.notifyEntities(AtlasHook.java:171) [atlas-notification-0.9-SNAPSHOT.jar:0.9-SNAPSHOT] at org.apache.atlas.hook.AtlasHook.notifyEntities(AtlasHook.java:105) [atlas-notification-0.9-SNAPSHOT.jar:0.9-SNAPSHOT] at org.apache.atlas.storm.hook.StormAtlasHook.notify(StormAtlasHook.java:102) [classes/:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131] at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) [clojure-1.7.0.jar:?] at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28) [clojure-1.7.0.jar:?] at org.apache.storm.LocalCluster$submit_hook.invoke(LocalCluster.clj:45) [storm-core-1.0.0.jar:1.0.0] at org.apache.storm.LocalCluster$_submitTopology.invoke(LocalCluster.clj:52) [storm-core-1.0.0.jar:1.0.0] at org.apache.storm.LocalCluster.submitTopology(Unknown Source) [storm-core-1.0.0.jar:1.0.0] at org.apache.atlas.storm.hook.StormTestUtil.submitTopology(StormTestUtil.java:67) [test-classes/:?] at org.apache.atlas.storm.hook.StormAtlasHookIT.testAddEntities(StormAtlasHookIT.java:75) [test-classes/:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131] at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80) [testng-6.1.1.jar:?] at org.testng.internal.Invoker.invokeMethod(Invoker.java:673) [testng-6.1.1.jar:?] at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:842) [testng-6.1.1.jar:?] at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1166) [testng-6.1.1.jar:?] at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125) [testng-6.1.1.jar:?] at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109) [testng-6.1.1.jar:?] at org.testng.TestRunner.runWorkers(TestRunner.java:1178) [testng-6.1.1.jar:?] at org.testng.TestRunner.privateRun(TestRunner.java:757) [testng-6.1.1.jar:?] at org.testng.TestRunner.run(TestRunner.java:608) [testng-6.1.1.jar:?] 
at org.testng.SuiteRunner.runTest(SuiteRunner.java:334) [testng-6.1.1.jar:?] at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329) [testng-6.1.1.jar:?] at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291) [testng-6.1.1.jar:?] at org.testng.SuiteRunner.run(SuiteRunner.java:240) [testng-6.1.1.jar:?] at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52) [testng-6.1.1.jar:?] at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86) [testng-6.1.1.jar:?] at org.testng.TestNG.runSuitesSequentially(TestNG.java:1158) [testng-6.1.1.jar:?] at org.testng.TestNG.runSuitesLocally(TestNG.java:1083) [testng-6.1.1.jar:?] at org.testng.TestNG.run(TestNG.java:999) [testng-6.1.1.jar:?] at org.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:72) [testng-plugin.jar:?] at org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:127) [testng-plugin.jar:?]Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms. at org.apache.kafka.clients.producer.KafkaProducer$FutureFailure.<init>(KafkaProducer.java:730) ~[kafka-clients-0.10.0.0.jar:?] at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:483) ~[kafka-clients-0.10.0.0.jar:?] at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:430) ~[kafka-clients-0.10.0.0.jar:?] at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:353) ~[kafka-clients-0.10.0.0.jar:?] at org.apache.atlas.kafka.KafkaNotification.sendInternalToProducer(KafkaNotification.java:219) ~[atlas-notification-0.9-SNAPSHOT.jar:0.9-SNAPSHOT] ... 42 moreCaused by: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.189137 [Thread-19-words-executor[8 8]] INFO o.a.s.d.task - Emitting: words default [nathan] Even when I try to run the test case for the builtin Hook for Storm I get the same error. Any idea how to get past this?
Labels:
- Apache Atlas
- Apache Kafka
09-13-2017
01:47 PM
Hi, I have installed Atlas natively (not HDP). I tried to use the REST API, but I notice that the V2 API does not work. I have the latest build from https://git-wip-us.apache.org/repos/asf/atlas.git and Atlas is running locally. I built it with the following configuration: mvn clean package -Pdist,berkeley-elasticsearch. I notice I keep getting the following warning in my application log: 2017-09-12 13:16:34,998 WARN - [main-SendThread(localhost:9026):] ~ Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (ClientCnxn$SendThread:1102)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
When I hit the URL below I get the expected response: localhost:21000/api/atlas/types. But when I try localhost:21000/api/atlas/v2/types I get the following error in the log: 2017-09-12 13:16:35,000 ERROR - [pool-1-thread-7 - 32b40c72-e4ba-41cd-aded-62634756785a:] ~ Error handling a request: e454ceee5c6d9335 (ExceptionMapperUtil:32)
javax.ws.rs.WebApplicationException
at com.sun.jersey.server.impl.uri.rules.TerminatingRule.accept(TerminatingRule.java:66)
at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:808)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.apache.atlas.web.filters.AuditFilter.doFilter(AuditFilter.java:76)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)
at org.apache.atlas.web.filters.AtlasAuthorizationFilter.doFilter(AtlasAuthorizationFilter.java:157)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:114)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:137)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:111)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.apache.atlas.web.filters.AtlasCSRFPreventionFilter$ServletFilterHttpInteraction.proceed(AtlasCSRFPreventionFilter.java:235)
at org.apache.atlas.web.filters.AtlasCSRFPreventionFilter.handleHttpInteraction(AtlasCSRFPreventionFilter.java:177)
at org.apache.atlas.web.filters.AtlasCSRFPreventionFilter.doFilter(AtlasCSRFPreventionFilter.java:190)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.apache.atlas.web.filters.AtlasAuthenticationFilter.doFilter(AtlasAuthenticationFilter.java:340)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:170)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.apache.atlas.web.filters.AtlasKnoxSSOAuthenticationFilter.doFilter(AtlasKnoxSSOAuthenticationFilter.java:132)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.apache.atlas.web.filters.StaleTransactionCleanupFilter.doFilter(StaleTransactionCleanupFilter.java:55)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilterInternal(BasicAuthenticationFilter.java:215)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:200)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:116)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:105)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:56)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:214)
at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:177)
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:346)
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:262)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
The URL mentioned above comes from https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_data-governance/content/atlas_types_api.html. The REST API page for Atlas has the following URL: https://atlas.apache.org/api/v2/types/typedefs. When I try this URL I get a 404 message: "Looking for something? We're sorry. The web address you're looking for is not a functioning page in Apache Atlas. Please try navigating from Apache Atlas Home." Any suggestions?
Labels:
- Apache Atlas
08-24-2017
06:12 PM
OK, here are all the steps required to run Apache Atlas natively with Berkeley DB and Elasticsearch:
1. Download Kafka from https://kafka.apache.org/downloads. Download the binary and extract it to your required location.
2. Kafka and Atlas also require ZooKeeper. By default Kafka ships with an instance of ZooKeeper, so if you do not have ZooKeeper installed or running you can use that one. Navigate to the Kafka home directory and run: bin/zookeeper-server-start.sh config/zookeeper.properties
3. Once ZooKeeper has started, check it with: netstat -ant | grep :2181. If everything is fine you should see: tcp6 0 0 :::2181 :::* LISTEN
4. Now start the Kafka server: bin/kafka-server-start.sh config/server.properties
5. To check whether Kafka is running, run: netstat -ant | grep :9092. You should see a similar result to the one above.
6. Now you are ready to move on to Atlas. You can either use the link provided on the website or do a branch/tag checkout directly from GitHub. I used the command from their website: git clone https://git-wip-us.apache.org/repos/asf/atlas.git atlas
7. Navigate into the folder: cd atlas
8. Create a new folder called libext: mkdir libext
9. Download the Berkeley DB JE zip from http://download.oracle.com/otn/berkeley-db/je-5.0.73.zip. You will need an Oracle account; create one to download the zip file. Copy the zip file into the libext folder you just created.
10. Run: export MAVEN_OPTS="-Xmx1536m -XX:MaxPermSize=512m"
11. Run: mvn clean install -DskipTests (make sure to skip the tests)
12. Run: mvn clean package -DskipTests -Pdist,berkeley-elasticsearch
13. Navigate to the location of atlas_start.py, which depends on which repo you used: incubator-atlas/distro/target/apache-atlas-0.8-incubating-bin/apache-atlas-0.8-incubating/bin/ or /home/arsalan/Development/atlas/distro/target/apache-atlas-0.9-SNAPSHOT-bin/apache-atlas-0.9-SNAPSHOT/bin/
14. Run: python atlas_start.py
15. You can now navigate to localhost:21000 to check the Atlas GUI. Hope it helps!
08-24-2017
05:28 PM
I was not able to run it with HBase and Solr, but the installation with Berkeley DB and Elasticsearch works by simply following the installation steps on their website.
08-14-2017
06:05 PM
@anaik I was trying with the following command: mvn clean package -DskipTests -Pdist,embedded-hbase-solr. In this case Solr and HBase are installed and started automatically when Atlas starts. I did not know that I would need to install Kafka separately. I do have it installed; I will run it and see if any parameters need to be set in Atlas for Kafka.
08-12-2017
09:51 PM
Hi,
I tried to install Apache Atlas on my laptop, but unfortunately I keep getting exceptions. Can anyone make a tutorial for getting Atlas running on a laptop with the embedded Solr and HBase setting, using the link Apache Atlas Installation? How do I configure it, and what needs to be installed beforehand (ZooKeeper)? I just need a basic installation with defaults. It would be great to mention:
- the checkout URL
- the Maven commands
- additional dependencies (software)
- configuration file locations and settings
Once you download the repo there are multiple similar folders with the same content, and I am confused about which one to use and where to run scripts like atlas_start.py. Thanks
Labels:
- Apache Atlas
08-12-2017
09:23 PM
@Jay SenSharma I am not running Atlas on HDP. I had a look into the conf dir; I have an hbase folder there, which contains an hbase-site.xml template. I renamed the file, and it has the following contents: <configuration>
<property>
<name>hbase.rootdir</name>
<value>${url_prefix}${atlas_data}/hbase-root</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>${atlas_data}/hbase-zookeeper-data</value>
</property>
<property>
<name>hbase.master.info.port</name>
<value>61510</value>
</property>
<property>
<name>hbase.regionserver.info.port</name>
<value>61530</value>
</property>
<property>
<name>hbase.master.port</name>
<value>61500</value>
</property>
<property>
<name>hbase.regionserver.port</name>
<value>61520</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase-unsecure</value>
</property>
</configuration>
I restarted Atlas but I get the same exception. Also, I have installed ZooKeeper and it is running on port 2181. I am running things locally on my laptop, using the default configurations in the application properties: atlas-application.txt
08-12-2017
06:54 PM
@anaik @Ashutosh Mestry I checked and ZooKeeper is running and accepting connections. Attached is the log file: atlas20170812-204645.txt. I am unable to proceed; kindly help.
08-11-2017
06:13 PM
@Hitesh Rajpurohit try: wget https://github.com/hortonworks/data-tutorials/blob/archive-hdp-2.5/tutorials/hdp/hdp-2.5/cross-component-lineage-with-apache-atlas-across-apache-sqoop-hive-kafka-storm/assets/crosscomponent_scripts.zip?raw=true
08-08-2017
07:58 AM
Thanks! That fixed it. I had to change the hostname in the file, and now it works!
08-08-2017
07:05 AM
Hi, I have installed HDF 3.0 on Windows 7. I want to connect to the VM via the browser (web shell client), but when I try to log in to the sandbox I get the following error: ssh: Could not resolve hostname sandbox.hortonworks.com: Name or service not known. I see it tries to look for the hostname sandbox.hortonworks.com; however, the sandbox hostname in the case of HDF is sandbox-hdf.hortonworks.com. I have added this hostname to the windows/sys32/hosts file. Is there a setting where I need to make this change to make Shell In A Box work? How can I configure Shell In A Box to look for the correct hostname? Thanks
Labels:
- Cloudera DataFlow (CDF)
08-05-2017
07:03 PM
@Jay SenSharma I did restart NiFi; so far it displays nothing. Attached is the new log file: ambari-metrics-collectorlog.tar.gz. Also, I still can't reach http://localhost:6188/ws/v1/timeline/metrics from my browser, although now when I use the following command: # netstat -tnlpa | grep 6188 I do get a bunch of items. After cleaning the data I see the metrics do work, and I do see some graphs in widgets like CPU Usage, Memory Usage, and Disk IO on the main Ambari page, but unfortunately I still don't see any data for NiFi; it still says no data available. I have the default settings for the reporting task in NiFi: ${ambari.metrics.collector.url} ${ambari.application.id} ${hostname(true)}. I also checked lsof -p `cat /var/run/nifi/nifi.pid` | grep metrics and I do have a bunch of JAR files. Any idea why I can't see the NiFi metrics and why I can't hit the URL?
08-05-2017
05:58 PM
@Jay SenSharma thanks for the reply. The value is set correctly. Is Ambari dependent on anything else, such as Ambari Infra? Do I need to start any other service as well? I followed the instructions in the link above and cleared all the data. Now I no longer seem to get the error in the NiFi bulletin, but I do not see any metrics on the NiFi page in Ambari; it still displays no data available. I did run a flow to see if it gets updated.
08-05-2017
04:26 PM
@Jay SenSharma thanks for the reply. I don't see anything when I run the first command. Attached is the log file: hdflog.txt. It seems to be a ZooKeeper issue. Any idea?