Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Hive metadata does not show up in Atlas with hook

New Member

I have Hive and Atlas installed on HDP 2.6.5 with Hive hook for Atlas enabled and no changes to the configuration. I am able to successfully import Hive metadata with import-hive.sh, but the Hive hook does not seem to work. When I create a database in Hive, it does not show up in Atlas.

The only things I see in the logs are

atlas/application.log:

ERROR - [pool-2-thread-5 - de30e17a-1db7-4aad-8f34-a61a27b33cff:] ~ graph rollback due to exception AtlasBaseException:Instance __AtlasUserProfile with unique attribute {name=admin} does not exist (GraphTransactionInterceptor:73)

hive logs:

./hiveserver2.log.2018-09-10:2018-09-10 14:55:42,587 INFO [HiveServer2-Background-Pool: Thread-61]: hook.AtlasHook (AtlasHook.java:<clinit>(99)) - Created Atlas Hook

./hiveserver2.log:2018-09-11 09:04:20,100 INFO [HiveServer2-Background-Pool: Thread-3201]: log.PerfLogger (PerfLogger.java:PerfLogBegin(149)) - <PERFLOG method=PostHook.org.apache.atlas.hive.hook.HiveHook from=org.apache.hadoop.hive.ql.Driver>

./hiveserver2.log:2018-09-11 09:04:20,100 INFO [HiveServer2-Background-Pool: Thread-3201]: log.PerfLogger (PerfLogger.java:PerfLogBegin(149)) - <PERFLOG method=PostHook.org.apache.atlas.hive.hook.HiveHook from=org.apache.hadoop.hive.ql.Driver>

When I search for the database name in /kafka-logs/, it only shows up in ./ATLAS_HOOK-0/00000000000000000014.log, and the entry looks like

{"msgSourceIP":"172.18.181.235","msgCreatedBy":"hive","msgCreationTime":1536681860109,"message":{"entities":{"referredEntities":{},"entities":[{"typeName":"hive_db","attributes":{"owner":"hive","ownerType":"USER","qualifiedName":"oyster9@bigcentos","clusterName":"bigcentos","name":"oyster9","location":"hdfs://host.com:8020/apps/hive/warehouse/oyster9.db","parameters":{}},"guid":"-82688521591125","version":0}]},"type":"ENTITY_CREATE_V2","user":"hive"},"version":{"version":"1.0.0"},"msgCompressionKind":"NONE","msgSplitIdx":1,"msgSplitCount":1}

This tells me that the message about the new DB gets published to the Kafka topic, but is not read by Atlas.

I do not know where to look next; my goal is for a database created in Hive to show up in Atlas automatically.
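One thing I plan to check next is whether Atlas's Kafka consumer group is actually reading ATLAS_HOOK (a sketch; this assumes the default Atlas notification group id `atlas` and the HDP default broker port 6667):

```shell
# Describe the Atlas notification consumer group and its lag on ATLAS_HOOK.
# If the group is missing, or its lag keeps growing, Atlas is not consuming
# the messages that the Hive hook publishes.
./kafka-consumer-groups.sh --bootstrap-server localhost:6667 \
  --describe --group atlas
```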

1 ACCEPTED SOLUTION

New Member

Changing offsets.topic.replication.factor in Kafka config to 1 (the number of brokers) addressed the issue.


7 REPLIES

Master Mentor

@Maxim Neaga

Can you please check if you have the Hive clients installed on the Atlas node?

Also, can you please let us know if you have set up Ranger? If yes, have you added the proper policies/permissions?

New Member

@Jay Kumar SenSharma

Yes, I have Hive clients installed on all nodes.

I do have Ranger installed, but all plugins are disabled, so I do not think it affects it in any way.

I do see the messages about the new database published to the Kafka ATLAS_HOOK topic with

./kafka-console-consumer.sh --zookeeper localhost:2181 --topic ATLAS_HOOK --from-beginning

but for some reason, it does not work with --bootstrap-server:

./kafka-console-consumer.sh --bootstrap-server localhost:6667 --topic ATLAS_HOOK --from-beginning

[2018-09-12 14:41:32,010] WARN [Consumer clientId=consumer-1, groupId=console-consumer-67769] Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
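That "Broker may not be available" warning usually means no broker is listening on the host:port the client is using. The broker's listeners setting in Kafka's server.properties shows which address it actually binds (port 6667 is the HDP default; the hostname below is illustrative):

```properties
# server.properties: clients using --bootstrap-server must connect to an
# address that matches this listener (localhost will fail if the broker
# only binds its FQDN).
listeners=PLAINTEXT://host.com:6667
```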

Expert Contributor

@Maxim Neaga It is safe to ignore the error related to __AtlasUserProfile. It's a false positive.

New Member

Changing offsets.topic.replication.factor in Kafka config to 1 (the number of brokers) addressed the issue.
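For anyone else hitting this, in the broker's server.properties (set via Ambari on HDP) the change looks like the following. Note that 1 was appropriate only because this cluster had a single broker; the factor must not exceed the number of live brokers:

```properties
# server.properties: replication factor for the internal __consumer_offsets
# topic. If it exceeds the number of live brokers, the topic cannot be
# created, offset commits fail, and consumers such as Atlas make no progress.
offsets.topic.replication.factor=1
```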


Hi, I am facing the same issue even after changing the offsets.topic.replication.factor parameter to 1 in the Kafka config. Note that I have CDP 7.1.7 and a total of 8 brokers.

I am able to import using import-hive.sh, but not via the hook. Any suggestion will be appreciated.

Thanks, Syed.

Cloudera Employee

I was seeing similar errors before. Changing offsets.topic.replication.factor to 1 seemed to resolve the issue; I had only one Kafka broker. I don't see the errors now, and the Hive tables come in.

New Member

Hi! Did you resolve the issue?