Member since: 12-09-2015
Posts: 6
Kudos Received: 13
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 244 | 12-20-2016 06:36 PM
 | 765 | 09-13-2016 04:03 AM
 | 607 | 09-08-2016 04:47 AM
12-20-2016
06:36 PM
3 Kudos
Hi, we have an official document for deploying HDP on VMs: https://hortonworks.com/wp-content/uploads/2014/02/1514.Deploying-Hortonworks-Data-Platform-VMware-vSphere-0402161.pdf. It has a reference link for the HVE (NodeGroup) feature, which includes the details you may want to know.
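For reference, enabling the NodeGroup-aware topology that HVE introduces mostly comes down to a few topology settings; a minimal sketch (property names as used in the HVE work, please verify against the guide for your HDP version):

```xml
<!-- core-site.xml: use the NodeGroup-aware network topology (sketch) -->
<property>
  <name>net.topology.impl</name>
  <value>org.apache.hadoop.net.NetworkTopologyWithNodeGroup</value>
</property>
<property>
  <name>net.topology.nodegroup.aware</name>
  <value>true</value>
</property>

<!-- hdfs-site.xml: block placement policy that understands node groups -->
<property>
  <name>dfs.block.replicator.classname</name>
  <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithNodeGroup</value>
</property>
```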
09-19-2016
01:08 AM
1 Kudo
As @Rajkumar Singh mentioned above, the application state (life-cycle related events) gets persisted by the RM via the RMStateStore. For the ApplicationMaster (AM), there is no central place for the AM to persist its own internal state, but the AM is free to pick a place to store temporary results/progress, so an AM failure/restart doesn't have to throw away all the progress the previous AM attempt made. Take MapReduce for example: when configured properly, the MR AM will, after a restart, read the finished map/reduce tasks from the job history files on HDFS, so finished map/reduce tasks won't be re-executed after the AM restart. Other applications can implement similar behavior if they want to persist something.
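As an illustration of both pieces (RM state store plus MR AM recovery), here is a minimal sketch of the relevant properties, assuming a ZooKeeper-backed RM state store; values are illustrative and defaults vary by version:

```xml
<!-- yarn-site.xml: let the RM persist application state in a state store -->
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<!-- mapred-site.xml: allow a restarted MR AM to recover finished tasks from job history files -->
<property>
  <name>yarn.app.mapreduce.am.job.recovery.enable</name>
  <value>true</value>
</property>
```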
09-19-2016
12:56 AM
Hi Arun, you can configure RM HA so that the second RM (called the standby RM) automatically takes over when the first RM (the active RM) goes down due to hardware/software issues. For how to deploy it, please refer to: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_hadoop-ha/content/ha-rm-deploy-ra-cluster.html . For the MapReduce JHS (Job HistoryServer), all the write paths are in the application's AM (which writes to HDFS directly) and the JHS only acts as a reader, so any time you find it down, simply bringing it back up should be fine. The ATS (Application Timeline Server) is a bit more complicated: v1 is based on LevelDB (a local key-value store) and v1.5 builds on it (adding an HDFS cache layer), so both still have a SPOF (Single Point of Failure) problem. ATS v2 is still in development; it has a distributed write path and uses HBase as the backend store, which gets rid of the SPOF issue entirely. I would suggest keeping ATS v1.5 for now and upgrading to v2 in the future when it is ready.
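For orientation, RM HA boils down to a handful of yarn-site.xml properties; a minimal sketch with hypothetical hostnames (rm1.example.com, rm2.example.com and the ZooKeeper quorum are placeholders), see the linked doc for the full list:

```xml
<!-- yarn-site.xml: enable RM HA with two RMs coordinated through ZooKeeper -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yarn-cluster</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>rm1.example.com</value> <!-- hypothetical host -->
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>rm2.example.com</value> <!-- hypothetical host -->
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value> <!-- hypothetical ZK quorum -->
</property>
```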
09-13-2016
04:03 AM
2 Kudos
Hi Arun, the FairScheduler is not the HDP-recommended/supported resource scheduler, so we don't have documentation covering it. Please refer to the Apache docs: https://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-site/FairScheduler.html. I would like to add some background here: the preemption feature was first added to the CapacityScheduler, where it is quite mature and production ready. For the FairScheduler, I am not exactly sure of its status (alpha or GA), but I notice several fixes are still in progress in the community: https://issues.apache.org/jira/browse/YARN-4752 Is there any special reason to use the FairScheduler? If not, you can also try preemption with the CapacityScheduler. Here is the doc: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_yarn_resource_mgt/content/preemption.html
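If you go the CapacityScheduler route, preemption is enabled through the scheduler monitor; a minimal sketch (tuning values are illustrative, see the linked HDP doc for their meanings):

```xml
<!-- yarn-site.xml: turn on the preemption monitor for the CapacityScheduler -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>

<!-- optional tuning (illustrative values; defaults differ by version) -->
<property>
  <name>yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval</name>
  <value>3000</value>
</property>
<property>
  <name>yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round</name>
  <value>0.1</value>
</property>
```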
09-08-2016
04:47 AM
7 Kudos
This configuration has been around since MR v1. It serves as an upper limit on the number of DN locations per job split, intended to protect the JobTracker from being overloaded by jobs with huge numbers of split locations. For YARN in Hadoop 2, this concern is lessened because we have a per-job AM instead of the JT. However, it can still impact the RM, since the RM may see heavy requests from an AM that tries to obtain many localities for its splits. When the limit is hit, the location list is truncated to the given limit, sacrificing a bit of data locality but removing the risk of the RM becoming a bottleneck. Depending on your job's priority (I believe it is a per-job configuration now), you can leave it at the default (for lower or normal priority jobs) or increase it to a larger number. Increasing the value beyond the number of DNs has the same effect as setting it to the number of DNs.
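For reference, in Hadoop 2 this is the per-job property mapreduce.job.max.split.locations; a minimal sketch (the value 50 is purely illustrative and the default is version-dependent; it can also be passed per job with -D):

```xml
<!-- mapred-site.xml (or per job via -Dmapreduce.job.max.split.locations=...) -->
<property>
  <name>mapreduce.job.max.split.locations</name>
  <value>50</value> <!-- illustrative; cap on recorded split locations -->
</property>
```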
12-09-2015
06:02 PM
The ShuffleHandler is part of the NM auxiliary services, so it uses the NM daemon's memory rather than a container's. Increasing the container heap size therefore may not help in your case. You should increase the NM daemon's heap size instead, as suggested by @Neeraj Sabharwal. @Hajime
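As a sketch of where the NM daemon heap is set when not managed through Ambari, it is the NodeManager heap size in yarn-env.sh; the value below is purely illustrative:

```sh
# yarn-env.sh: NodeManager daemon heap in MB (illustrative value, size for your workload)
export YARN_NODEMANAGER_HEAPSIZE=2048
```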