
How to import Hive metadata into Apache Atlas?

Expert Contributor

Hey Guys,

I am trying to import Hive metadata into Apache Atlas. I ran import-hive.sh but ran into the notification below.

Thank you,

Subash

1 ACCEPTED SOLUTION

Master Guru

I ran into this timeout before. I found I had Ranger enabled on Kafka, and the API would time out. Can you verify that the Ranger plugin is disabled for the Kafka topics, or grant the correct permissions to the atlas user?


10 REPLIES

Expert Contributor

/usr/lib/jvm/jre/bin/java and/or /usr/lib/jvm/jre/bin/jar not found on the system. Please make sure java and jar commands are available.

Super Collaborator

Hi @subash sharma,

Make sure your Java environment is configured properly, for example that JAVA_HOME is set (see the sketch below).
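For instance, something along these lines before running the bridge script. This is only a minimal sketch; the JDK path is illustrative, so point it at wherever your JDK actually lives:

export JAVA_HOME=/usr/lib/jvm/java       # illustrative JDK location
export PATH="$JAVA_HOME/bin:$PATH"       # so the java and jar commands resolve
which java jar                           # quick sanity check
sh import-hive.sh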

Please have a look at https://community.hortonworks.com/questions/39839/how-to-import-metadata-from-hive-into-atlas-and-th...

/Best regards, Mats

Expert Contributor

Hey @Mats Johansson, as you suggested I have configured the JAVA_HOME path. Now I am running into a time-out issue. Please find the log below:

sh import-hive.sh
/usr/lib/jvm/java/bin/java
/usr/lib/jvm/java/bin/jar
Using Hive configuration directory [/etc/hive/conf]
Log file for import is /usr/hdp/2.5.3.0-37/atlas/logs/import-hive.log
2016-12-12 14:28:03,318 INFO - [main:] ~ Looking for atlas-application.properties in classpath (ApplicationProperties:73)
2016-12-12 14:28:03,322 INFO - [main:] ~ Loading atlas-application.properties from file:/etc/hive/2.5.3.0-37/0/atlas-application.properties (ApplicationProperties:86)
2016-12-12 14:28:03,374 DEBUG - [main:] ~ Configuration loaded: (ApplicationProperties:99)
2016-12-12 14:28:03,374 DEBUG - [main:] ~ atlas.authentication.method.kerberos = False (ApplicationProperties:102)
2016-12-12 14:28:03,376 DEBUG - [main:] ~ atlas.cluster.name = governance (ApplicationProperties:102)
2016-12-12 14:28:03,376 DEBUG - [main:] ~ atlas.hook.hive.keepAliveTime = 10 (ApplicationProperties:102)
2016-12-12 14:28:03,376 DEBUG - [main:] ~ atlas.hook.hive.maxThreads = 5 (ApplicationProperties:102)
2016-12-12 14:28:03,376 DEBUG - [main:] ~ atlas.hook.hive.minThreads = 5 (ApplicationProperties:102)
2016-12-12 14:28:03,376 DEBUG - [main:] ~ atlas.hook.hive.numRetries = 3 (ApplicationProperties:102)
2016-12-12 14:28:03,376 DEBUG - [main:] ~ atlas.hook.hive.queueSize = 1000 (ApplicationProperties:102)
2016-12-12 14:28:03,376 DEBUG - [main:] ~ atlas.hook.hive.synchronous = false (ApplicationProperties:102)
2016-12-12 14:28:03,376 DEBUG - [main:] ~ atlas.kafka.bootstrap.servers = (ApplicationProperties:102)
2016-12-12 14:28:03,377 DEBUG - [main:] ~ atlas.kafka.hook.group.id = atlas (ApplicationProperties:102)
2016-12-12 14:28:03,377 DEBUG - [main:] ~ atlas.kafka.zookeeper.connect = (ApplicationProperties:102)
2016-12-12 14:28:03,377 DEBUG - [main:] ~ atlas.kafka.zookeeper.connection.timeout.ms = 200 (ApplicationProperties:102)
2016-12-12 14:28:03,377 DEBUG - [main:] ~ atlas.kafka.zookeeper.session.timeout.ms = 400 (ApplicationProperties:102)
2016-12-12 14:28:03,379 DEBUG - [main:] ~ atlas.kafka.zookeeper.sync.time.ms = 20 (ApplicationProperties:102)
2016-12-12 14:28:03,379 DEBUG - [main:] ~ atlas.notification.create.topics = True (ApplicationProperties:102)
2016-12-12 14:28:03,379 DEBUG - [main:] ~ atlas.notification.replicas = 1 (ApplicationProperties:102)
2016-12-12 14:28:03,379 DEBUG - [main:] ~ atlas.notification.topics = [ATLAS_HOOK, ATLAS_ENTITIES] (ApplicationProperties:102)
2016-12-12 14:28:03,379 DEBUG - [main:] ~ atlas.rest.address = (ApplicationProperties:102)
2016-12-12 14:28:03,380 DEBUG - [main:] ~ ==> InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:168)
2016-12-12 14:28:03,383 DEBUG - [main:] ~ ==> InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:181)
2016-12-12 14:28:03,387 DEBUG - [main:] ~ ==> InMemoryJAASConfiguration.initialize() (InMemoryJAASConfiguration:220)
2016-12-12 14:28:03,387 DEBUG - [main:] ~ <== InMemoryJAASConfiguration.initialize() (InMemoryJAASConfiguration:347)
2016-12-12 14:28:03,387 DEBUG - [main:] ~ <== InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:190)
2016-12-12 14:28:03,387 DEBUG - [main:] ~ <== InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:177)
Enter username for atlas :- admin
Enter password for atlas :- admin
2016-12-12 14:28:09,373 INFO - [main:] ~ Client has only one service URL, will use that for all actions: http://localhost:21000 (AtlasClient:265)
2016-12-12 14:28:10,427 WARN - [main:] ~ Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (NativeCodeLoader:62)
2016-12-12 14:28:10,778 DEBUG - [main:] ~ Using resource localhost:21000/api/atlas/types/hdfs_path for 0 times (AtlasClient:784)
2016-12-12 14:28:10,804 DEBUG - [main:] ~ API localhost:21000/api/atlas/types/hdfs_path returned status 200 (AtlasClient:1191)
2016-12-12 14:28:11,010 INFO - [main:] ~ HDFS data model is already registered! (HiveMetaStoreBridge:609)
2016-12-12 14:28:11,011 DEBUG - [main:] ~ Using resource localhost:21000/api/atlas/types/hive_process for 0 times (AtlasClient:784)
2016-12-12 14:28:11,022 DEBUG - [main:] ~ API localhost:21000/api/atlas/types/hive_process returned status 200 (AtlasClient:1191)
2016-12-12 14:28:11,042 INFO - [main:] ~ Hive data model is already registered! (HiveMetaStoreBridge:624)
2016-12-12 14:28:11,042 INFO - [main:] ~ Importing hive metadata (HiveMetaStoreBridge:117)
2016-12-12 14:28:11,045 DEBUG - [main:] ~ Getting reference for database default (HiveMetaStoreBridge:211)
2016-12-12 14:28:11,046 DEBUG - [main:] ~ Using resource localhost:21000/api/atlas/entities?type=hive_db&property=qualifiedName&value=default@governance for 0 times (AtlasClient:784)
2016-12-12 14:28:11,068 DEBUG - [main:] ~ API localhost:21000/api/atlas/entities?type=hive_db&property=qualifiedName&value=default@governance returned status 200 (AtlasClient:1191)
2016-12-12 14:28:11,687 INFO - [main:] ~ Database default is already registered with id bdd91811-250d-43c7-b814-b90124960f5a. Updating it. (HiveMetaStoreBridge:157)
2016-12-12 14:28:11,687 INFO - [main:] ~ Importing objects from databaseName : default (HiveMetaStoreBridge:166)
2016-12-12 14:28:11,688 DEBUG - [main:] ~ updating instance of type hive_db (HiveMetaStoreBridge:501)
2016-12-12 14:28:11,711 DEBUG - [main:] ~ Updating entity hive_db = { "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"bdd91811-250d-43c7-b814-b90124960f5a", "version":0, "typeName":"hive_db", "state":"ACTIVE" }, "typeName":"hive_db", "values":{ "name":"default", "location":"hdfs://localhost:8020/apps/hive/warehouse", "description":"Default Hive database", "ownerType":2, "qualifiedName":"default@governance", "owner":"public", "clusterName":"governance", "parameters":{ } }, "traitNames":[ ], "traits":{ } } (HiveMetaStoreBridge:504)
2016-12-12 14:28:11,717 DEBUG - [main:] ~ Updating entity id bdd91811-250d-43c7-b814-b90124960f5a with { "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"bdd91811-250d-43c7-b814-b90124960f5a", "version":0, "typeName":"hive_db", "state":"ACTIVE" }, "typeName":"hive_db", "values":{ "name":"default", "location":"hdfs://localhost:8020/apps/hive/warehouse", "description":"Default Hive database", "ownerType":2, "qualifiedName":"default@governance", "owner":"public", "clusterName":"governance", "parameters":{ } }, "traitNames":[ ], "traits":{ } } (AtlasClient:807)
2016-12-12 14:28:11,717 DEBUG - [main:] ~ Using resource http://localhost:21000/api/atlas/entities/bdd91811-250d-43c7-b814-b90124960f5a for 0 times (AtlasClient:784)
2016-12-12 14:29:11,766 WARN - [main:] ~ Handled exception in calling api api/atlas/entities (AtlasClient:791)
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
    at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:81)
    at com.sun.jersey.api.client.Client.handle(Client.java:648)
    at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
    at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
    at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:623)
    at org.apache.atlas.AtlasClient.callAPIWithResource(AtlasClient.java:1188)
    at org.apache.atlas.AtlasClient.callAPIWithRetries(AtlasClient.java:785)
    at org.apache.atlas.AtlasClient.callAPI(AtlasClient.java:1214)
    at org.apache.atlas.AtlasClient.updateEntity(AtlasClient.java:808)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.updateInstance(HiveMetaStoreBridge.java:506)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.registerDatabase(HiveMetaStoreBridge.java:159)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:124)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:118)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:662)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1371)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:240)
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147)
    ... 14 more
2016-12-12 14:29:11,768 WARN - [main:] ~ Exception's cause: class java.net.SocketTimeoutException (AtlasClient:792)
Exception in thread "main" com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
    at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:81)
    at com.sun.jersey.api.client.Client.handle(Client.java:648)
    at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
    at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
    at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:623)
    at org.apache.atlas.AtlasClient.callAPIWithResource(AtlasClient.java:1188)
    at org.apache.atlas.AtlasClient.callAPIWithRetries(AtlasClient.java:785)
    at org.apache.atlas.AtlasClient.callAPI(AtlasClient.java:1214)
    at org.apache.atlas.AtlasClient.updateEntity(AtlasClient.java:808)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.updateInstance(HiveMetaStoreBridge.java:506)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.registerDatabase(HiveMetaStoreBridge.java:159)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:124)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:118)
    at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:662)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1371)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:240)
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147)
    ... 14 more
Failed to import Hive Data Model!!!

Master Guru

I ran into this timeout before. I found I had Ranger enabled on Kafka, and the API would time out. Can you verify that the Ranger plugin is disabled for the Kafka topics, or grant the correct permissions to the atlas user?
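If you would rather not disable Ranger outright, the idea is the same: the atlas user needs access to the Atlas notification topics, ATLAS_HOOK and ATLAS_ENTITIES (visible in the log above). As a rough sketch with the stock Kafka ACL tool, assuming plain ACL-based authorization and a local ZooKeeper; with the Ranger plugin active you would grant the equivalent through Ranger policies instead:

# grant the atlas user read/write on the Atlas notification topics (ZooKeeper address is illustrative)
kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
  --add --allow-principal User:atlas \
  --operation Read --operation Write --topic ATLAS_HOOK
kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
  --add --allow-principal User:atlas \
  --operation Read --operation Write --topic ATLAS_ENTITIES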

Expert Contributor

Hey @Sunile Manjee, thank you! I disabled the Kafka Ranger plugin and it worked. Now Atlas is capturing real-time changes occurring in Hive.

Master Guru

No problem. Enjoy Atlas!

Contributor

There is another option that does not require disabling the Ranger plugin. According to the official Atlas documentation, if the cluster has Ranger installed you only need to create a couple of Ranger policies for the Kafka topics (a sketch of creating them via the REST API follows the screenshot below):

[Screenshot attachment: 10295-ranger-policies.png - Ranger policies for the Kafka topics]
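For reference, here is a rough sketch of creating such a policy through the Ranger Admin REST API instead of the UI. The host, credentials, policy name, and the Kafka service/repository name (cl1_kafka) are placeholders, and the access types (publish, consume) are those of the Ranger Kafka plugin; adjust everything to your cluster:

# Hypothetical example: creates one policy covering both Atlas notification topics
# for the atlas user and the hook user (e.g. hive).
curl -u admin:admin -H "Content-Type: application/json" \
  -X POST "http://ranger-admin-host:6080/service/public/v2/api/policy" \
  -d '{
        "service": "cl1_kafka",
        "name": "atlas_notification_topics",
        "resources": { "topic": { "values": ["ATLAS_HOOK", "ATLAS_ENTITIES"] } },
        "policyItems": [{
          "users": ["atlas", "hive"],
          "accesses": [
            { "type": "publish", "isAllowed": true },
            { "type": "consume", "isAllowed": true }
          ]
        }]
      }'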

Master Mentor

@subash sharma

Did you succeed with the Hive metadata import into Atlas?

Expert Contributor

@Geoffrey Shelton Okot,

Yes, I was able to run the shell script after commenting out "exit 1". Code pasted below:

if [ ! -e "${JAVA_BIN}" ] || [ ! -e "${JAR_BIN}" ]; then
    echo "$JAVA_BIN and/or $JAR_BIN not found on the system. Please make sure java and jar commands are available."
    # exit 1
fi