Atlas not tracking Hive lineage

I'm using the HDP Sandbox with HDP 2.5 and Atlas 0.7 in a VirtualBox VM (12 GB RAM allocated). To install Atlas, I started HBase, installed and started Ambari Infra (Solr), and then installed and started Atlas. Everything works as intended so far.

However, when I create tables in Hive and add data, nothing is tracked in Atlas. I tried setting the hooks.synchronous configuration of Hive to true, but nothing changed.

Running import-hive.sh yields the following errors:

[root@sandbox ~]# /usr/hdp/current/atlas-server/hook-bin/import-hive.sh
Using Hive configuration directory [/etc/hive/conf]
Log file for import is /usr/hdp/current/atlas-server/logs/import-hive.log
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/configuration/PropertiesConfiguration
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:636)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.configuration.PropertiesConfiguration
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 13 more
Failed to import Hive Data Model!!!

Is it due to the logger? Advice I found online says to copy the required jars into the Atlas/Hadoop classpath, but

[root@sandbox ~]# find /usr/ -name slf*.jar -print
/usr/hdp/share/hst/hst-common/lib/slf4j-api-1.7.5.jar
/usr/hdp/share/hst/hst-common/lib/slf4j-log4j12-1.7.5.jar
/usr/hdp/2.5.0.0-1245/atlas/hook/kafka-topic-setup/slf4j-api-1.7.7.jar
/usr/hdp/2.5.0.0-1245/atlas/hook/kafka-topic-setup/slf4j-log4j12-1.7.7.jar
/usr/hdp/2.5.0.0-1245/atlas/server/webapp/atlas/WEB-INF/lib/slf4j-api-1.7.7.jar
/usr/hdp/2.5.0.0-1245/atlas/server/webapp/atlas/WEB-INF/lib/slf4j-log4j12-1.7.7.jar

...

shows that the jars _should_ be present.
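Note that the exception itself names org/apache/commons/configuration/PropertiesConfiguration rather than an SLF4J class, so it may be more telling to look for a jar that contains that class. Below is a minimal sketch of such a search in Python. The function name find_jars_with_class is hypothetical, and the search root /usr/hdp is only assumed from the paths printed above.

```python
import os
import zipfile


def find_jars_with_class(root, class_name):
    """Return sorted paths of .jar files under root that contain class_name."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".jar"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with zipfile.ZipFile(path) as jar:
                    if entry in jar.namelist():
                        matches.append(path)
            except (zipfile.BadZipFile, OSError):
                # Skip corrupt or unreadable archives instead of aborting.
                continue
    return sorted(matches)


if __name__ == "__main__":
    # Assumption: /usr/hdp is the install root, as in the find output above.
    for jar_path in find_jars_with_class(
        "/usr/hdp", "org.apache.commons.configuration.PropertiesConfiguration"
    ):
        print(jar_path)
```

If the search prints a jar, that jar exists on disk but is apparently not on the classpath that import-hive.sh builds. If it prints nothing, the class is missing under that root altogether.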

Following this post (https://community.hortonworks.com/questions/63284/failed-to-import-hive-data-model.html) helped me get past this.

However, I then received another error, this time from the REST call:

2016-11-08 16:03:15,383 DEBUG - [main:] ~ Creating entities: ["{\n  \"jsonClass\":\"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference\",\n  \"id\":{\n    \"jsonClass\":\"org.apache.atlas.typesystem.json.InstanceSerialization$_Id\",\n    \"id\":\"-7591808153364\",\n    \"version\":0,\n    \"typeName\":\"hive_db\",\n    \"state\":\"ACTIVE\"\n  },\n  \"typeName\":\"hive_db\",\n  \"values\":{\n    \"name\":\"default\",\n    \"location\":\"hdfs:\/\/sandbox.hortonworks.com:8020\/apps\/hive\/warehouse\",\n    \"description\":\"Default Hive database\",\n    \"ownerType\":2,\n    \"qualifiedName\":\"default@Sandbox\",\n    \"owner\":\"public\",\n    \"clusterName\":\"Sandbox\",\n    \"parameters\":{\n      \n    }\n  },\n  \"traitNames\":[\n    \n  ],\n  \"traits\":{\n    \n  }\n}"] (AtlasClient:694)
2016-11-08 16:03:15,390 DEBUG - [main:] ~ Using resource http://sandbox.hortonworks.com:21000/api/atlas/entities for 0 times (AtlasClient:784)
2016-11-08 16:04:15,485 WARN  - [main:] ~ Handled exception in calling api api/atlas/entities (AtlasClient:791)
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
        at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
        at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:81)
        at com.sun.jersey.api.client.Client.handle(Client.java:648)
        at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
        at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
        at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:623)
        at org.apache.atlas.AtlasClient.callAPIWithResource(AtlasClient.java:1188)
        at org.apache.atlas.AtlasClient.callAPIWithRetries(AtlasClient.java:785)
        at org.apache.atlas.AtlasClient.callAPI(AtlasClient.java:1214)
        at org.apache.atlas.AtlasClient.createEntity(AtlasClient.java:695)
        at org.apache.atlas.AtlasClient.createEntity(AtlasClient.java:712)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.registerInstance(HiveMetaStoreBridge.java:197)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.registerDatabase(HiveMetaStoreBridge.java:155)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:124)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:118)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:662)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:170)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
        at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:240)
        at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147)
        ... 15 more
2016-11-08 16:04:15,490 WARN  - [main:] ~ Exception's cause: class java.net.SocketTimeoutException (AtlasClient:792)
Exception in thread "main" com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
        at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
        at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:81)
        at com.sun.jersey.api.client.Client.handle(Client.java:648)
        at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
        at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
        at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:623)
        at org.apache.atlas.AtlasClient.callAPIWithResource(AtlasClient.java:1188)
        at org.apache.atlas.AtlasClient.callAPIWithRetries(AtlasClient.java:785)
        at org.apache.atlas.AtlasClient.callAPI(AtlasClient.java:1214)
        at org.apache.atlas.AtlasClient.createEntity(AtlasClient.java:695)
        at org.apache.atlas.AtlasClient.createEntity(AtlasClient.java:712)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.registerInstance(HiveMetaStoreBridge.java:197)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.registerDatabase(HiveMetaStoreBridge.java:155)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:124)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:118)
        at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:662)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:170)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
        at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:240)
        at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147)
        ... 15 more
Failed to import Hive Data Model!!!

So it fails to create the entities.

Re: Atlas not tracking Hive lineage

@Jannik Franz

You don't say whether you started the Kafka service. Kafka is the messaging system used to keep everything in sync. You can read more about the configuration here http://atlas.incubator.apache.org/Configuration.html and the architecture here http://atlas.incubator.apache.org/Architecture.html.

Can you confirm that Kafka is started? If it is not, can you start Kafka and try to repeat the process?
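If it helps, here is a rough way to check from the sandbox whether anything is listening on the relevant ports. This is only a sketch: the helper is_port_open is hypothetical, the hostname and port 21000 are taken from the log in the question, and 6667 is an assumed Kafka broker port that should be checked against the Kafka listeners setting in Ambari.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    host = "sandbox.hortonworks.com"  # hostname from the log in the question
    # 21000 is the Atlas port seen in the log; 6667 is an ASSUMED Kafka
    # broker port, so adjust it to match your Kafka configuration.
    for name, port in [("Atlas", 21000), ("Kafka (assumed port)", 6667)]:
        state = "reachable" if is_port_open(host, port) else "not reachable"
        print(f"{name} on {host}:{port} is {state}")
```

An open port only shows that something is listening there, not that the service is healthy, so the service status in Ambari is still the more reliable check.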

Re: Atlas not tracking Hive lineage

Thanks. It wasn't mentioned that I needed Kafka. After starting it, everything worked fine!

I received another error, unrelated to this one, and will open another thread.
