Member since: 01-11-2018
Posts: 33
Kudos Received: 1
Solutions: 1

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3191 | 04-13-2018 08:38 AM
07-06-2020
09:54 AM
Just wondering if you found a workaround for this? I think this is a known bug in Hive 1.1, but unfortunately upgrading Hive is not an option for us right now. https://issues.apache.org/jira/browse/HIVE-14555
03-17-2019
09:03 PM
This enables the disk balancer feature on a cluster. By default, the disk balancer is disabled. Then why does your config set <value>false</value>?
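For reference, a minimal hdfs-site.xml sketch of what enabling the feature looks like, assuming the standard dfs.disk.balancer.enabled property (the shipped default is false):

```xml
<!-- hdfs-site.xml: turn on the HDFS intra-DataNode disk balancer -->
<property>
  <name>dfs.disk.balancer.enabled</name>
  <value>true</value>
</property>
```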
02-14-2019
12:19 AM
Great! Thank you very much!
01-23-2019
07:38 AM
Hi Bimalc, thank you very much for your answer. At this moment I can only confirm that fs.namenode.delegation.token.max-lifetime is set to 7 days. We use a Gobblin keytab and have experimented with different settings of gobblin.yarn.login.interval.minutes and gobblin.yarn.token.renew.interval.minutes on the Gobblin side, but with no success yet. I've started a new run of Gobblin now, so we'll need to wait some time for the next failure. I'll check the logs for possible token renewal errors or any other suspicious symptoms and get back in this thread with the results. Thanks!
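For context, a sketch of the NameNode setting in question, assuming the post's fs.namenode.delegation.token.max-lifetime corresponds to the stock hdfs-site.xml key dfs.namenode.delegation.token.max-lifetime (the value is illustrative):

```xml
<!-- hdfs-site.xml: maximum total lifetime of an HDFS delegation token,
     in milliseconds (604800000 ms = 7 days, matching the value above) -->
<property>
  <name>dfs.namenode.delegation.token.max-lifetime</name>
  <value>604800000</value>
</property>
```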
10-26-2018
06:57 AM
Hi, actually both the session and operation timeouts are set to 6h, so this shouldn't be the problem. Thanks!
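Assuming these refer to the HiveServer2 idle timeouts, a sketch of the two settings in question (6 hours expressed in milliseconds; hive-site.xml property names):

```xml
<!-- hive-site.xml: HiveServer2 idle timeouts, 21600000 ms = 6 hours -->
<property>
  <name>hive.server2.idle.session.timeout</name>
  <value>21600000</value>
</property>
<property>
  <name>hive.server2.idle.operation.timeout</name>
  <value>21600000</value>
</property>
```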
07-17-2018
03:13 AM
Hi @bgooley, it's working, thank you. Just wanted to add that one may instead add this configuration to the Hue Server Advanced Configuration Snippet (Safety Valve) for hue_safety_valve_server.ini, which makes it possible to keep this piece of configuration different across individual Hue Server instances. Thanks!
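For illustration, a sketch of such a per-instance override; the [desktop] section and the app_blacklist key below are hypothetical placeholders, not the actual setting discussed in this thread:

```ini
# hue_safety_valve_server.ini -- Hue Server Advanced Configuration Snippet
# (Safety Valve). Set per server role instance, so each Hue server can
# carry a different value.
[desktop]
# hypothetical key: substitute the real section/option being overridden
app_blacklist=impala
```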
05-08-2018
04:18 AM
Hi @Harsh J, thanks a million for such a thorough and elaborate answer. I haven't solved the problem yet; probably I will apply the cgroups configuration as suggested. I hope it's going to work, though the reason why a single JVM uses so much CPU remains mysterious to me.

I understand that YARN treats a vcore as a rough forecast of how much CPU time will be used, and we could probably mitigate the problem by requesting more vcores in the job's application or otherwise reducing the number of running containers on the node, but we still wouldn't be guaranteed that some containers wouldn't use even more CPU, up to the total capacity of the server. It looks like containers running many threads, resulting in a CPU share of more than 100% per container, undermine the model of how YARN dispatches tasks to nodes.

I've also come across this tutorial: https://hortonworks.com/blog/apache-hadoop-yarn-in-hdp-2-2-isolation-of-cpu-resources-in-your-hadoop-yarn-clusters/ which includes: "how do we ensure that containers don’t exceed their vcore allocation? What’s stopping an errant container from spawning a bunch of threads and consume all the CPU on the node?" It appears that in the past the rule of thumb of one vcore per one real core worked (I saw it in several older tutorials), but workload patterns have changed (less I/O-dependent, more CPU-consuming) and this rule no longer holds up. So effectively cgroups seem to be the only way to ensure containers don't exceed their vcore allocation. Let me know if you agree or see other solutions. Thanks a million!
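For reference, a minimal sketch of the cgroups-based CPU enforcement discussed above (yarn-site.xml property names; strict mode is what actually caps a container at its vcore share — treat the values as illustrative):

```xml
<!-- yarn-site.xml: enforce per-container CPU limits via cgroups (sketch) -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
<property>
  <!-- hard-cap each container at its vcore allocation instead of
       letting it borrow idle CPU from the rest of the node -->
  <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
  <value>true</value>
</property>
```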
04-13-2018
08:38 AM
I've found the solution: it's possible to use another parameter, hive.conf.restricted.list, to prevent users from overriding mapred.job.queuename.
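A sketch of what that could look like in hive-site.xml (the property name is from the post; the list contents are illustrative — restricted properties cannot be changed by clients at runtime, e.g. via SET):

```xml
<!-- hive-site.xml: forbid clients from overriding the listed settings -->
<property>
  <name>hive.conf.restricted.list</name>
  <value>mapred.job.queuename</value>
</property>
```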
04-09-2018
08:51 AM
Hi @Harsh J, thank you for an even more thorough answer; the placement policy is clear now. I hadn't seen the risk of rogue YARN apps before, so that's very helpful. Many thanks!