Why is Atlas not tracking Hive tables created through Ambari in the HDP 2.5 sandbox?

Guru

I imported data into Hive through the Ambari View. I don't see the table in Atlas. Any idea how to get Atlas to pick up my new tables and begin to track lineage?

1 ACCEPTED SOLUTION

@Vasilis Vagias

The issue seems to be a corrupted Hive hook bin path. Copying the required jars from another location into the Hive hook path and adding them to HADOOP_CLASSPATH should fix it. After that, import-hive.sh should import the Hive metadata into Atlas successfully.
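
To make the classpath step concrete, here is a minimal sketch that collects every jar in a hook directory and builds a HADOOP_CLASSPATH value from it. The directory shown is an assumption (this thread does not name the exact hook path or which jars are required), and `build_hadoop_classpath` is a hypothetical helper, not part of Atlas or Hadoop.

```python
import os
from pathlib import Path


def build_hadoop_classpath(hook_dir, existing=""):
    """Return a HADOOP_CLASSPATH value that includes every jar in hook_dir."""
    jars = sorted(str(p) for p in Path(hook_dir).glob("*.jar"))
    if not jars:
        # An empty or missing directory is what a broken hook path looks like.
        raise FileNotFoundError(f"no jars found in {hook_dir}")
    parts = [existing] if existing else []
    parts.extend(jars)
    # Classpath entries are joined with the platform separator (':' on Linux).
    return os.pathsep.join(parts)


if __name__ == "__main__":
    # Assumed location, for illustration only; adjust to your installation.
    hook_dir = "/usr/hdp/current/atlas-server/hook/hive"
    try:
        value = build_hadoop_classpath(hook_dir, os.environ.get("HADOOP_CLASSPATH", ""))
        print("export HADOOP_CLASSPATH=" + value)
    except FileNotFoundError as err:
        print(f"Hook path looks empty or missing: {err}")
```

If the script reports an empty or missing directory, that is consistent with the corrupted hook path described above; the printed export line can be run in the shell before calling import-hive.sh.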

5 REPLIES

@Vasilis Vagias you may need to run import-hive.sh, found in /usr/hdp/current/atlas-server/hook-bin. If that does not help, there may be a communication issue in the Hive-Atlas bridge, which should be identifiable in the logs. See http://atlas.incubator.apache.org/Bridge-Hive.html

Guru

@Ayub Pathan I was able to get the script to run, but now I get the following error:

2016-09-23 15:18:27,918 WARN  - [main:] ~ Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (NativeCodeLoader:62)
[main] INFO org.apache.atlas.hive.bridge.HiveMetaStoreBridge - HDFS data model is already registered!
[main] INFO org.apache.atlas.hive.bridge.HiveMetaStoreBridge - Hive data model is already registered!
[main] INFO org.apache.atlas.hive.bridge.HiveMetaStoreBridge - Importing hive metadata
[main] INFO org.apache.atlas.hive.bridge.HiveMetaStoreBridge - Database default is already registered with id c09b0a90-f9f6-4b1e-85ed-dd8aa617d44e. Updating it.
[main] INFO org.apache.atlas.hive.bridge.HiveMetaStoreBridge - Importing objects from databaseName : default
[main] WARN org.apache.atlas.AtlasClient - Handled exception in calling api api/atlas/entities
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:81)
at com.sun.jersey.api.client.Client.handle(Client.java:648)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:623)
at org.apache.atlas.AtlasClient.callAPIWithResource(AtlasClient.java:1188)
at org.apache.atlas.AtlasClient.callAPIWithRetries(AtlasClient.java:785)
at org.apache.atlas.AtlasClient.callAPI(AtlasClient.java:1214)
at org.apache.atlas.AtlasClient.updateEntity(AtlasClient.java:808)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.updateInstance(HiveMetaStoreBridge.java:506)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.registerDatabase(HiveMetaStoreBridge.java:159)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:124)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:118)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:662)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1325)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:240)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147)
... 14 more
[main] WARN org.apache.atlas.AtlasClient - Exception's cause: class java.net.SocketTimeoutException
Exception in thread "main" com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:81)
at com.sun.jersey.api.client.Client.handle(Client.java:648)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:623)
at org.apache.atlas.AtlasClient.callAPIWithResource(AtlasClient.java:1188)
at org.apache.atlas.AtlasClient.callAPIWithRetries(AtlasClient.java:785)
at org.apache.atlas.AtlasClient.callAPI(AtlasClient.java:1214)
at org.apache.atlas.AtlasClient.updateEntity(AtlasClient.java:808)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.updateInstance(HiveMetaStoreBridge.java:506)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.registerDatabase(HiveMetaStoreBridge.java:159)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:124)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:118)
at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:662)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1325)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:240)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147)
... 14 more
Failed to import Hive Data Model!!!
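
The log above shows the same failure twice: once logged as a warning by AtlasClient, then again as the uncaught exception. In both copies the "Caused by:" line names the root cause, a read timeout on the client's HTTP call to Atlas. As a small sketch for pulling that out of a long log, the hypothetical helper below (not part of Atlas) collects the distinct "Caused by:" entries from pasted log text.

```python
import re

# Matches lines such as "Caused by: java.net.SocketTimeoutException: Read timed out".
CAUSED_BY = re.compile(r"^Caused by: ([\w.$]+)(?:: (.*))?$")


def root_causes(log_text):
    """Return the distinct (exception class, message) pairs from 'Caused by:' lines."""
    seen = []
    for line in log_text.splitlines():
        match = CAUSED_BY.match(line.strip())
        if match:
            cause = (match.group(1), match.group(2) or "")
            if cause not in seen:
                seen.append(cause)
    return seen


# A few lines copied from the log in this thread.
sample = """\
[main] WARN org.apache.atlas.AtlasClient - Handled exception in calling api api/atlas/entities
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
"""

for exc, msg in root_causes(sample):
    print(f"{exc}: {msg}")
```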

@Vasilis Vagias I am getting the same "SocketTimeoutException: Read timed out" when running import-hive.sh. Only the "default" db entity instance is created/updated before the timeout. Were you able to find a solution to this error? Thanks.

Super Collaborator

@Vasilis Vagias, you just have to go to Ambari => Services => Hive => Configs and change the value of the property atlas.hook.hive.synchronous to true. It is false by default. You can also follow the Cross Component Lineage tutorial, which covers lineage for MySQL-Sqoop-Hive and Kafka-Storm:

http://hortonworks.com/hadoop-tutorial/cross-component-lineage-apache-atlas/
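
To confirm the change took effect, a minimal sketch like the one below can check the property in the text of a Java-style properties file. It assumes the Ambari setting ends up in such a file on the Hive host; the file name and location are not given in this thread, so the example works on a pasted string, and `hook_is_synchronous` is a hypothetical helper.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' or '!' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def hook_is_synchronous(text):
    """Return True if atlas.hook.hive.synchronous is set to true."""
    # A missing property is treated as false, the default described above.
    value = parse_properties(text).get("atlas.hook.hive.synchronous", "false")
    return value.lower() == "true"


# Illustrative file content, not copied from a real installation.
example = """\
# Hive hook settings
atlas.hook.hive.synchronous=true
"""

print(hook_is_synchronous(example))
```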