Hive Hooks - What are the recommended values when both Hive and Atlas are installed?

Environment - HDP-2.6.2 with Hive and Atlas on the cluster.

Current Config (based on HDP Doc):

hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook

hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook, org.apache.atlas.hive.hook.HiveHook

hive.exec.failure.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook

Scenario - Often, a simple Hive query like 'show databases' fails with the following error:

2017-10-10 14:00:38,985 INFO  [HiveServer2-Background-Pool: Thread-273112]: log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=PostHook.org.apache.atlas.hive.hook.HiveHook from=org.apache.hadoop.hive.ql.Driver>
2017-10-10 14:00:38,986 ERROR [HiveServer2-Background-Pool: Thread-273112]: ql.Driver (SessionState.java:printError(962)) - FAILED: Hive Internal Error: java.util.concurrent.RejectedExecutionException(Task java.util.concurrent.FutureTask@3f389d45 rejected from java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807])
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@3f389d45 rejected from java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807]
  at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
  at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
  at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
  at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
  at org.apache.atlas.hive.hook.HiveHook.run(HiveHook.java:174)

I found HCC posts that suggest updating the config as shown below:

hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook

This solved the problem in the HCC post. However, I want to understand:

1. What is the significance of the values mentioned in the document?

2. If removing the value solves the problem, is the document incorrect?

3. What impact does updating the value as per the HCC post have on the functioning of Atlas?

1 ACCEPTED SOLUTION

Expert Contributor

This happens because the thread pool is configured with a max pool size of 1 and a waiting queue size of 10000. Since a queue of 10000 is already very large, it is better to increase the number of threads in the thread pool.

Below is the Atlas property that increases the max threads:

atlas.hook.hive.maxThreads

atlas.hook.hive.queueSize (default 10000) is not recommended to increase.

If you still face this issue after increasing atlas.hook.hive.maxThreads, I suggest adding a new HiveServer2 HA node for better load balancing.
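To illustrate the failure mode in the stack trace, here is a minimal, standalone Java sketch. It is not the Atlas hook code. It assumes the hook behaves like a plain java.util.concurrent.ThreadPoolExecutor with a bounded queue and the default AbortPolicy (which is what the trace shows), and it sets the core size equal to the max size for simplicity. The class name HookPoolSketch, the method countRejections, and the small sizes are made up for the demo. Tasks block on a latch to mimic slow notifications.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class HookPoolSketch {

    /**
     * Submits `tasks` blocking tasks to a pool of `maxThreads` workers backed by
     * a bounded queue of `queueSize` slots, and returns how many submissions
     * were rejected. All parameters are illustrative, not Atlas defaults.
     */
    static int countRejections(int maxThreads, int queueSize, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        // No handler is passed, so the default AbortPolicy applies: once the
        // workers are busy and the queue is full, submit() throws.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize));
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.submit(() -> {
                        try {
                            release.await(); // simulate a slow task
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // One worker: everything beyond 1 running + 10 queued is rejected.
        System.out.println("maxThreads=1 rejected: " + countRejections(1, 10, 20));
        // Five workers: fewer submissions are rejected for the same load.
        System.out.println("maxThreads=5 rejected: " + countRejections(5, 10, 20));
    }
}
```

In this sketch the pool can hold at most maxThreads + queueSize unfinished tasks, so a single worker that cannot keep up eventually fills the queue no matter how large it is. That is why adding threads is preferred over enlarging the queue.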

3 REPLIES 3

Rising Star
@Dinesh Chitlangia

Since you have Atlas configured, you need to have the Atlas hook specified in hive.exec.post.hooks.

Here the issue looks to be due to the large number of queued tasks, so you may want to check why there are so many queued tasks:

java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807]

Thank you. I had to increase maxThreads to get it working.