Created 10-11-2017 12:52 AM
Environment - HDP-2.6.2 with Hive and Atlas on the cluster.
Current Config (based on HDP Doc):
hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook
hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.atlas.hive.hook.HiveHook
hive.exec.failure.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook
Scenario - A simple Hive query such as 'show databases' often fails with the following error:
2017-10-10 14:00:38,985 INFO [HiveServer2-Background-Pool: Thread-273112]: log.PerfLogger (PerfLogger.java:PerfLogBegin(135)) - <PERFLOG method=PostHook.org.apache.atlas.hive.hook.HiveHook from=org.apache.hadoop.hive.ql.Driver>
2017-10-10 14:00:38,986 ERROR [HiveServer2-Background-Pool: Thread-273112]: ql.Driver (SessionState.java:printError(962)) - FAILED: Hive Internal Error: java.util.concurrent.RejectedExecutionException(Task java.util.concurrent.FutureTask@3f389d45 rejected from java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807])
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@3f389d45 rejected from java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
at org.apache.atlas.hive.hook.HiveHook.run(HiveHook.java:174)
I found HCC posts that suggest removing the Atlas hook from the post hooks, so that the config becomes:
hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook
This solved the problem in the HCC post; however, I want to understand:
1. What is the significance of the values mentioned in the document?
2. If removing the value solves the problem, is the document incorrect?
3. What impact does updating the value as per the HCC post have on the functioning of Atlas?
Created 10-13-2017 03:41 PM
This is happening because the thread pool is configured with max pool size = 1 and waiting queue size = 10000. Once the single thread is busy and all 10000 queue slots are full, every new task is rejected. Since 10000 is already very high, it is better to increase the number of threads in the thread pool.
Below are the relevant Atlas properties:
atlas.hook.hive.maxThreads
atlas.hook.hive.queueSize (default 10000), not recommended to increase.
If you still face this issue after increasing atlas.hook.hive.maxThreads, I suggest adding a new HiveServer2 HA node for better load balancing.
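To illustrate why more threads help, here is a minimal sketch of how a bounded java.util.concurrent.ThreadPoolExecutor rejects work once its threads and its queue are both full. This is not the Atlas hook code: the class name HookPoolDemo, the method countRejections and the small sizes are made up for the demo, and the core pool size is simply set equal to the max pool size.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Atlas or Hive.
public class HookPoolDemo {

    /**
     * Submits `tasks` tasks that all block until released, and returns
     * how many submissions were rejected by the default AbortPolicy.
     */
    static int countRejections(int maxThreads, int queueSize, int tasks)
            throws InterruptedException {
        // Assumption for the demo: core size == max size.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(queueSize));
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                // Each task waits on the latch, simulating a slow consumer.
                pool.submit(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                // Same exception type as in the HiveServer2 log above.
                rejected++;
            }
        }
        release.countDown();          // let the accepted tasks finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        // 1 thread + 3 queue slots accept 4 of 10 tasks.
        System.out.println(countRejections(1, 3, 10)); // prints 6
        // 4 threads + 3 queue slots accept 7 of 10 tasks.
        System.out.println(countRejections(4, 3, 10)); // prints 3
    }
}
```

With the same queue size, adding threads raises the number of tasks the pool can hold before it starts rejecting, and in a real workload it also drains the queue faster.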
Created 10-12-2017 04:43 AM
Since you have Atlas configured, you need to have the Atlas hook specified in hive.exec.post.hooks.
Here the issue looks to be due to the large number of queued tasks, so you may want to check why there are so many queued tasks:
java.util.concurrent.ThreadPoolExecutor@5b868755[Running, pool size = 1, active threads = 1, queued tasks = 10000, completed tasks = 14807]
Created 10-18-2017 11:35 PM
Thank you. I had to increase maxThreads to get it working.