Hive database/table monitoring

Hi guys,

I am using HDP version 2.6.1.40-4 on our dev servers. The Hive version is 1.2.1.

We use Hive tables as the source for our framework: we read different columns from different tables and then run Spark jobs to process them. We maintain a config table in Hive that specifies which columns we want from each source table. If someone renames a column or adds new columns to a source table, we have to update this config table manually.

Please share some ideas on how to monitor or track, in real time, what is happening in the Hive metastore, and on which push notification mechanism would be most suitable for alerting us. Thanks in advance.

Re: Hive database/table monitoring

Hi @Mushtaq Rizvi, thinking out loud: if you are looking at the Hive metastore and it's running on Oracle, MySQL, or MariaDB, I suppose you could create standard triggers to notify you when something changes. I know this can be done in SQL Server, but I haven't explored the other RDBMS options. Be careful about how this would affect performance, depending on the rate of change.
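To make the trigger idea concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database purely as a stand-in for the metastore RDBMS, and the table names `columns_demo` and `schema_audit` are hypothetical, simplified placeholders rather than the real metastore schema. Trigger syntax also differs between Oracle, MySQL, and MariaDB, so treat this only as an illustration of the pattern: a trigger writes each change to an audit table that a notifier could poll.

```python
import sqlite3

# In-memory SQLite stands in for the metastore RDBMS. The tables below are
# simplified, hypothetical placeholders, not the real metastore schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE columns_demo (
    table_name  TEXT,
    column_name TEXT,
    type_name   TEXT
);
CREATE TABLE schema_audit (
    event       TEXT,
    table_name  TEXT,
    old_column  TEXT,
    new_column  TEXT
);
-- Record every newly added column.
CREATE TRIGGER trg_column_added AFTER INSERT ON columns_demo
BEGIN
    INSERT INTO schema_audit VALUES ('ADD', NEW.table_name, NULL, NEW.column_name);
END;
-- Record every column rename.
CREATE TRIGGER trg_column_renamed AFTER UPDATE OF column_name ON columns_demo
WHEN OLD.column_name <> NEW.column_name
BEGIN
    INSERT INTO schema_audit VALUES ('RENAME', NEW.table_name, OLD.column_name, NEW.column_name);
END;
""")

# Simulate a source-table owner adding and then renaming a column.
conn.execute("INSERT INTO columns_demo VALUES ('orders', 'cust_id', 'int')")
conn.execute(
    "UPDATE columns_demo SET column_name = 'customer_id' WHERE column_name = 'cust_id'"
)

# A notifier would read this audit table and send the alerts.
audit_rows = conn.execute(
    "SELECT event, table_name, old_column, new_column FROM schema_audit ORDER BY rowid"
).fetchall()
```

Each trigger adds a write to every metastore change, which is where the performance concern above comes from.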

I'm not aware of a solution native to Hive. Hive does not support triggers, though there may be some better options once HPL/SQL is introduced into Hive.

Please update this post if you find another solution.

Re: Hive database/table monitoring

Hi @Scott Shaw, thanks for that answer. I will take a look.

I realized this can be achieved with Atlas. Metadata changes are picked up by HiveHook, which sends them to the ATLAS_HOOK topic in Kafka. I am comparing two options for consuming the JSON messages from that topic:

1) Connect with NiFi to filter that JSON and use the PutEmail processor for notification

2) Write a custom Java Kafka Consumer that does the same thing as above.
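Whichever option is used, the filtering step is the same. Below is a rough sketch of it in Python (chosen for brevity; a Java consumer would apply the same logic inside its poll loop). The message shape is an assumption: a top-level `message` object with a `type` field and an `entities` list whose items carry `typeName` and `values.qualifiedName`. The notification type names and the function `extract_schema_changes` are likewise illustrative, so check them against real messages on the ATLAS_HOOK topic first.

```python
import json

# Assumed notification and entity type names; confirm them against real
# messages from your own ATLAS_HOOK topic.
WATCHED_NOTIFICATIONS = {"ENTITY_CREATE", "ENTITY_FULL_UPDATE", "ENTITY_PARTIAL_UPDATE"}
WATCHED_ENTITY_TYPES = {"hive_table", "hive_column"}


def extract_schema_changes(raw_message):
    """Return (notification_type, entity_type, qualified_name) tuples worth alerting on."""
    try:
        payload = json.loads(raw_message)
    except (TypeError, ValueError):
        return []  # not valid JSON: nothing to report
    if not isinstance(payload, dict):
        return []
    message = payload.get("message") or {}
    if not isinstance(message, dict):
        return []
    notification_type = message.get("type")
    if notification_type not in WATCHED_NOTIFICATIONS:
        return []
    changes = []
    for entity in message.get("entities") or []:
        if not isinstance(entity, dict):
            continue
        entity_type = entity.get("typeName")
        if entity_type in WATCHED_ENTITY_TYPES:
            values = entity.get("values") or {}
            changes.append((notification_type, entity_type, values.get("qualifiedName")))
    return changes


# Hypothetical sample message, built by hand to match the assumed shape.
sample = json.dumps({
    "message": {
        "type": "ENTITY_FULL_UPDATE",
        "entities": [
            {"typeName": "hive_table", "values": {"qualifiedName": "sales.orders@dev"}},
            {"typeName": "hive_process", "values": {"qualifiedName": "query-1"}},
        ],
    }
})
changes = extract_schema_changes(sample)
```

Here `changes` contains only the `hive_table` entry, and an email or other alert would be sent for each tuple. In the NiFi option, the equivalent filter would sit in the flow ahead of PutEmail.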

Please let me know how you feel.

Re: Hive database/table monitoring

Hi @Mushtaq Rizvi, that sounds like a creative and good idea. I'm glad you are working something out that others can learn from. Thanks for posting!