
Can't import from Hive to Atlas: import-hive.sh terminating in error


New Contributor

Hi,

I'm on HDP 2.6 (a fresh install).

All the services are up and running.

Kafka is running and topics are there:

[root@ambariserver bin]# ./kafka-topics.sh --list --zookeeper localhost:2181

ATLAS_ENTITIES

ATLAS_HOOK

__consumer_offsets

As I can't see the new tables I created in Hive in Atlas, I used import-hive.sh to synchronize Hive with Atlas.

The script ended in an error due to a timeout.

Can someone explain the synchronization steps?

[root@ambariserver hook-bin]# ./import-hive.sh

Atlas Log Dir = /usr/hdp/current/atlas-server/logs

Using Hive configuration directory [/etc/hive/conf]

Log file for import is /usr/hdp/current/atlas-server/logs/import-hive.log

log4j:WARN Continuable parsing error 88 and column 23

log4j:WARN The content of element type "log4j:configuration" must match "(renderer*,throwableRenderer?,appender*,plugin*,(category|logger)*,root?,(categoryFactory|loggerFactory)?)".

2017-06-28 10:02:28,825 INFO- [main:] ~ Looking for atlas-application.properties in classpath (ApplicationProperties:78)

2017-06-28 10:02:28,831 INFO- [main:] ~ Loading atlas-application.properties from file:/etc/hive/2.6.0.3-8/0/atlas-application.properties (ApplicationProperties:91)

2017-06-28 10:02:28,898 DEBUG - [main:] ~ Configuration loaded: (ApplicationProperties:104)

2017-06-28 10:02:28,898 DEBUG - [main:] ~ atlas.authentication.method.kerberos = False (ApplicationProperties:107)

2017-06-28 10:02:28,902 DEBUG - [main:] ~ atlas.cluster.name = hws_cluster (ApplicationProperties:107)

2017-06-28 10:02:28,902 DEBUG - [main:] ~ atlas.hook.hive.keepAliveTime = 10 (ApplicationProperties:107)

2017-06-28 10:02:28,902 DEBUG - [main:] ~ atlas.hook.hive.maxThreads = 5 (ApplicationProperties:107)

2017-06-28 10:02:28,902 DEBUG - [main:] ~ atlas.hook.hive.minThreads = 5 (ApplicationProperties:107)

2017-06-28 10:02:28,902 DEBUG - [main:] ~ atlas.hook.hive.numRetries = 3 (ApplicationProperties:107)

2017-06-28 10:02:28,902 DEBUG - [main:] ~ atlas.hook.hive.queueSize = 1000 (ApplicationProperties:107)

2017-06-28 10:02:28,903 DEBUG - [main:] ~ atlas.hook.hive.synchronous = false (ApplicationProperties:107)

2017-06-28 10:02:28,903 DEBUG - [main:] ~ atlas.kafka.bootstrap.servers = ambariserver.com:6667 (ApplicationProperties:107)

2017-06-28 10:02:28,903 DEBUG - [main:] ~ atlas.kafka.hook.group.id = atlas (ApplicationProperties:107)

2017-06-28 10:02:28,903 DEBUG - [main:] ~ atlas.kafka.zookeeper.connect = ambariserver.com:2181 (ApplicationProperties:107)

2017-06-28 10:02:28,903 DEBUG - [main:] ~ atlas.kafka.zookeeper.connection.timeout.ms = 30000 (ApplicationProperties:107)

2017-06-28 10:02:28,903 DEBUG - [main:] ~ atlas.kafka.zookeeper.session.timeout.ms = 60000 (ApplicationProperties:107)

2017-06-28 10:02:28,906 DEBUG - [main:] ~ atlas.kafka.zookeeper.sync.time.ms = 20 (ApplicationProperties:107)

2017-06-28 10:02:28,906 DEBUG - [main:] ~ atlas.notification.create.topics = True (ApplicationProperties:107)

2017-06-28 10:02:28,906 DEBUG - [main:] ~ atlas.notification.replicas = 1 (ApplicationProperties:107)

2017-06-28 10:02:28,906 DEBUG - [main:] ~ atlas.notification.topics = [ATLAS_HOOK, ATLAS_ENTITIES] (ApplicationProperties:107)

2017-06-28 10:02:28,906 DEBUG - [main:] ~ atlas.rest.address = http://ambariserver.com:21000 (ApplicationProperties:107)

2017-06-28 10:02:28,908 DEBUG - [main:] ~ ==> InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:173)

2017-06-28 10:02:28,910 DEBUG - [main:] ~ ==> InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:186)

2017-06-28 10:02:28,919 DEBUG - [main:] ~ ==> InMemoryJAASConfiguration.initialize() (InMemoryJAASConfiguration:243)

2017-06-28 10:02:28,919 DEBUG - [main:] ~ <== InMemoryJAASConfiguration.initialize({}) (InMemoryJAASConfiguration:370)

2017-06-28 10:02:28,920 DEBUG - [main:] ~ <== InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:195)

2017-06-28 10:02:28,920 DEBUG - [main:] ~ <== InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:182)

Enter username for atlas :-

admin

Enter password for atlas :-

admin

2017-06-28 10:02:42,769 INFO- [main:] ~ Client has only one service URL, will use that for all actions: http://ambariserver.com:21000 (AtlasBaseClient:201)

2017-06-28 10:02:43,624 WARN- [main:] ~ Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (NativeCodeLoader:62)

2017-06-28 10:02:43,842 INFO- [main:] ~ Importing hive metadata (HiveMetaStoreBridge:133)

2017-06-28 10:02:43,845 DEBUG - [main:] ~ Getting reference for database default (HiveMetaStoreBridge:227)

2017-06-28 10:02:43,847 DEBUG - [main:] ~ Using resource http://ambariserver.com:21000/api/atlas/entities?type=hive_db&property=qualifiedName&value=default@h... for 0 times (AtlasBaseClient:413)

2017-06-28 10:02:43,848 DEBUG - [main:] ~ Calling API [ GET : api/atlas/entities ](AtlasBaseClient:295)

2017-06-28 10:02:43,894 DEBUG - [main:] ~ API http://ambariserver.com:21000/api/atlas/entities?type=hive_db&property=qualifiedName&value=default@h... returned status 200 (AtlasBaseClient:303)

2017-06-28 10:02:43,902 INFO- [main:] ~ Response = {"requestId":"pool-2-thread-5 - 498099ab-61ec-4383-8464-150adedd27bf","definition":{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference","id":{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id","id":"bd119d9a-d111-4755-b3d5-b7d0e6329f99","version":0,"typeName":"hive_db","state":"ACTIVE"},"typeName":"hive_db","values":{"name":"default","location":"hdfs:\/\/ambariserver.com:8020\/apps\/hive\/warehouse","description":"Default Hive database","ownerType":{"value":"ROLE","ordinal":2},"qualifiedName":"default@hws_cluster","owner":"public","clusterName":"hws_cluster","parameters":null},"traitNames":[],"traits":{},"systemAttributes":{"createdBy":"ambari-qa","modifiedBy":"admin","createdTime":"2017-05-29T15:28:51.528Z","modifiedTime":"2017-06-28T13:59:34.905Z"}}} (AtlasBaseClient:315)

2017-06-28 10:02:44,482 INFO- [main:] ~ Database default is already registered with id bd119d9a-d111-4755-b3d5-b7d0e6329f99. Updating it. (HiveMetaStoreBridge:173)

2017-06-28 10:02:44,482 INFO- [main:] ~ Importing objects from databaseName : default (HiveMetaStoreBridge:182)

2017-06-28 10:02:44,482 DEBUG - [main:] ~ updating instance of type hive_db (HiveMetaStoreBridge:521)

2017-06-28 10:02:44,494 DEBUG - [main:] ~ Updating entity hive_db = {

"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",

"id":{

"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",

"id":"bd119d9a-d111-4755-b3d5-b7d0e6329f99",

"version":0,

"typeName":"hive_db",

"state":"ACTIVE"

},

"typeName":"hive_db",

"values":{

"name":"default",

"location":"hdfs://ambariserver.com:8020/apps/hive/warehouse",

"description":"Default Hive database",

"ownerType":2,

"qualifiedName":"default@hws_cluster",

"owner":"public",

"clusterName":"hws_cluster",

"parameters":{

}

},

"traitNames":[

],

"traits":{

},

"systemAttributes":{

"createdBy":"ambari-qa",

"modifiedBy":"admin",

"createdTime":"2017-05-29T15:28:51.528Z",

"modifiedTime":"2017-06-28T13:59:34.905Z"

}

} (HiveMetaStoreBridge:524)

2017-06-28 10:02:44,496 DEBUG - [main:] ~ Updating entity id bd119d9a-d111-4755-b3d5-b7d0e6329f99 with {

"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",

"id":{

"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",

"id":"bd119d9a-d111-4755-b3d5-b7d0e6329f99",

"version":0,

"typeName":"hive_db",

"state":"ACTIVE"

},

"typeName":"hive_db",

"values":{

"name":"default",

"location":"hdfs://ambariserver.com:8020/apps/hive/warehouse",

"description":"Default Hive database",

"ownerType":2,

"qualifiedName":"default@hws_cluster",

"owner":"public",

"clusterName":"hws_cluster",

"parameters":{

}

},

"traitNames":[

],

"traits":{

},

"systemAttributes":{

"createdBy":"ambari-qa",

"modifiedBy":"admin",

"createdTime":"2017-05-29T15:28:51.528Z",

"modifiedTime":"2017-06-28T13:59:34.905Z"

}

} (AtlasClient:582)

2017-06-28 10:02:44,496 DEBUG - [main:] ~ Calling API [ POST : api/atlas/entities ] <== {

"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",

"id":{

"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",

"id":"bd119d9a-d111-4755-b3d5-b7d0e6329f99",

"version":0,

"typeName":"hive_db",

"state":"ACTIVE"

},

"typeName":"hive_db",

"values":{

"name":"default",

"location":"hdfs://ambariserver.com:8020/apps/hive/warehouse",

"description":"Default Hive database",

"ownerType":2,

"qualifiedName":"default@hws_cluster",

"owner":"public",

"clusterName":"hws_cluster",

"parameters":{

}

},

"traitNames":[

],

"traits":{

},

"systemAttributes":{

"createdBy":"ambari-qa",

"modifiedBy":"admin",

"createdTime":"2017-05-29T15:28:51.528Z",

"modifiedTime":"2017-06-28T13:59:34.905Z"

}

} (AtlasBaseClient:295)

Exception in thread "main" org.apache.atlas.hook.AtlasHookException: HiveMetaStoreBridge.main() failed.

at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:650)

Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out

at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)

at com.sun.jersey.api.client.filter.HTTPBasicAuthFilter.handle(HTTPBasicAuthFilter.java:81)

at com.sun.jersey.api.client.Client.handle(Client.java:648)

at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)

at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)

at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:623)

at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:297)

at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:287)

at org.apache.atlas.AtlasBaseClient.callAPI(AtlasBaseClient.java:429)

at org.apache.atlas.AtlasClient.callAPIWithBodyAndParams(AtlasClient.java:1006)

at org.apache.atlas.AtlasClient.updateEntity(AtlasClient.java:583)

at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.updateInstance(HiveMetaStoreBridge.java:526)

at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.registerDatabase(HiveMetaStoreBridge.java:175)

at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:140)

at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:134)

at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:647)

Caused by: java.net.SocketTimeoutException: Read timed out

at java.net.SocketInputStream.socketRead0(Native Method)

at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)

at java.net.SocketInputStream.read(SocketInputStream.java:171)

at java.net.SocketInputStream.read(SocketInputStream.java:141)

at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)

at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)

at java.io.BufferedInputStream.read(BufferedInputStream.java:345)

at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)

at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)

at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569)

at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)

at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)

at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:240)

at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147)

... 15 more

Failed to import Hive Data Model!!!

4 REPLIES 4

Re: Can't import from Hive to Atlas: import-hive.sh terminating in error

Expert Contributor

@Farhad Heybati

Your Hive import script ran correctly, but the Atlas server on port 21000 appears to be the problem.

Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out

The Atlas server itself might be down, or one of its dependent services (HBase, Solr, Kafka) might be down or unresponsive.

Please check whether these services are up and running.
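The reachability part of that check can be scripted. Below is a minimal sketch using bash's /dev/tcp; the hostname and ports are assumptions taken from the log output in the question (Atlas REST on 21000, the Kafka broker on 6667, ZooKeeper on 2181), so adjust them for your cluster. Note this only confirms a process is listening on the port, not that the service is healthy.

```shell
#!/usr/bin/env bash
# Quick reachability check for the services import-hive.sh depends on.
# Host and ports are taken from the question's log output; adjust as needed.
check_port() {
  local name=$1 host=$2 port=$3
  # /dev/tcp is a bash built-in pseudo-device; the redirect fails if the
  # port is closed or the host is unreachable.
  if timeout 3 bash -c "echo > /dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${name} is reachable on ${host}:${port}"
  else
    echo "${name} is NOT reachable on ${host}:${port}"
  fi
}

check_port "Atlas"     ambariserver.com 21000
check_port "Kafka"     ambariserver.com 6667
check_port "ZooKeeper" ambariserver.com 2181
```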

Re: Can't import from Hive to Atlas: import-hive.sh terminating in error

New Contributor

I can see an error in Ranger's audit log:

07/11/2017 11:19:34 AM  ANONYMOUS  hws_cluster_kafka  kafka  ATLAS_HOOK (topic)      describe  Denied  ranger-acl
07/11/2017 11:19:34 AM  ANONYMOUS  hws_cluster_kafka  kafka  ATLAS_ENTITIES (topic)  describe  Denied  ranger-acl

Could this be the source of the error?

Re: Can't import from Hive to Atlas: import-hive.sh terminating in error

Contributor
@Farhad Heybati

You can use the command below to check the authorization on the ATLAS_HOOK topic.

Note: the command below must be run as the kafka user.

/usr/hdp/current/kafka-broker/bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=<ZK_HOSTNAME>:2181 --list --topic ATLAS_HOOK

Here is a useful doc link:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_data-governance/content/ch_hdp_data_gove...

In Ranger-enabled environments, you need to create the following Kafka policies:

topic=ATLAS_HOOK 
permission=publish, create; group=public 
permission=consume, create; user=atlas (for non-kerberized environments, set group=public) 

topic=ATLAS_ENTITIES 
permission=publish, create; user=atlas (for non-kerberized environments, set group=public) 
permission=consume, create; group=public 
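If you prefer to script the policy creation instead of using the Ranger UI, Ranger exposes a public REST API for policies. The sketch below shows the ATLAS_HOOK policy from the list above; the Ranger host/port, admin credentials, and the Kafka service name ("hws_cluster_kafka", taken from the audit log in this thread) are assumptions, so adjust them for your environment.

```shell
#!/usr/bin/env bash
# Sketch: create the ATLAS_HOOK Kafka policy via Ranger's public REST API.
# Ranger host/port, credentials, and service name are assumptions; adjust
# them for your cluster before running.
cat > atlas_hook_policy.json <<'EOF'
{
  "service": "hws_cluster_kafka",
  "name": "atlas_hook_topic",
  "resources": {
    "topic": { "values": ["ATLAS_HOOK"], "isExcludes": false, "isRecursive": false }
  },
  "policyItems": [
    { "accesses": [ { "type": "publish" }, { "type": "create" } ], "groups": ["public"] },
    { "accesses": [ { "type": "consume" }, { "type": "create" } ], "users": ["atlas"] }
  ]
}
EOF

# POST the policy to Ranger admin (default port 6080).
curl --connect-timeout 5 -u admin:admin \
  -H "Content-Type: application/json" \
  -X POST -d @atlas_hook_policy.json \
  http://ambariserver.com:6080/service/public/v2/api/policy
```

The ATLAS_ENTITIES policy would be the same shape with the topic name and the publish/consume sides swapped, as in the list above.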


You can use the commands below to grant access.

Note: these commands can only be run by the kafka user.

/usr/hdp/current/kafka-broker/bin/kafka-acls.sh --topic ATLAS_HOOK --allow-principals * --operations All --authorizer-properties "zookeeper.connect=hostname:2181"
/usr/hdp/current/kafka-broker/bin/kafka-acls.sh --topic ATLAS_ENTITIES --allow-principals * --operations All --authorizer-properties "zookeeper.connect=hostname:2181"

Re: Can't import from Hive to Atlas: import-hive.sh terminating in error

New Contributor

Thank you for your response.

I think I have already done all of the above, but I will verify.

One point I don't understand is that the import makes an ANONYMOUS connection to the Kafka topic. I'm running the import as the root user; could that explain the anonymous connection?
