
yarn-utils.py and DISKS


Rising Star

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_command-line-installation/content/determi...

yarn-utils.py needs the number of CPU cores and DISKS on each host as input. How should I count the CPU cores on a host: just the physical CPUs, the physical cores, or the hyperthreads?

For DISKS in particular: is it the total number of physical disks on the host? Does it include the disk the OS is installed on, or only the disks used for HDFS? How do I determine the DISKS value if we use RAID 5 across 12 physical disks: is it DISKS=1 or DISKS=12?

With fewer DISKS installed on a host, the formula

# of containers = min (2*CORES, 1.8*DISKS, (total available RAM) / MIN_CONTAINER_SIZE)

means the # of containers is determined by DISKS (for example 1.8 with a single disk). So, to avoid wasting memory, do I need to set

yarn.scheduler.minimum-allocation-mb

to a larger value?
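To make the formula above concrete, here is a minimal Python sketch that evaluates its three terms and reports which one is the limit. The function names and the host figures (16 cores, 64 GB of RAM, 2048 MB minimum container size) are hypothetical, chosen only for illustration. This is not the code of yarn-utils.py itself, which may round the result differently.

```python
def container_limits(cores, disks, total_ram_mb, min_container_mb):
    """Return the three terms of the formula, keyed by the resource behind each."""
    return {
        "cores": 2 * cores,
        "disks": 1.8 * disks,
        "ram": total_ram_mb / min_container_mb,
    }


def limiting_resource(cores, disks, total_ram_mb, min_container_mb):
    """Return (resource name, value) for the smallest term, i.e. the min() of the formula."""
    limits = container_limits(cores, disks, total_ram_mb, min_container_mb)
    name = min(limits, key=limits.get)
    return name, limits[name]


# Hypothetical host: 16 cores, 64 GB RAM available, 2048 MB minimum container size.
# Only DISKS is varied, to show how strongly it drives the result.
for disks in (1, 12):
    name, value = limiting_resource(
        cores=16, disks=disks, total_ram_mb=65536, min_container_mb=2048
    )
    print(f"DISKS={disks}: limited by {name}, # of containers = {value:.1f}")
```

With these assumed numbers the DISKS term is the smallest in both cases, so the choice between DISKS=1 and DISKS=12 changes the result by roughly a factor of twelve.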

I want more containers so that the cluster can support more concurrent applications. How can I get more containers if the hosts have only a few physical disks?

I am confused about how memory, CPU, DISKS, the # of containers, and the number of concurrently running applications relate to each other.

Thanks


Re: yarn-utils.py and DISKS

Super Collaborator

For 2.6.1, the documentation (https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_command-line-installation/content/determine-hdp-memory-config.html) points to the link below, which apparently doesn't exist.

Can anyone point me in the right direction?

http://public-repo-1.hortonworks.com/HDP/tools/2.6.1.0/hdp_manual_install_rpm_helper_files-2.6.1.0.1...


Re: yarn-utils.py and DISKS

New Contributor

It's not entirely clear to me either what the right value of DISKS is, especially in the RAID case. The documentation you linked* says:

DISKS is the value for *dfs.datanode.data.dir* (number of data disks) per machine.


* https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_command-line-installation/content/determ...

HDP v2.5 - dfs.datanode.data.dir

HDP v2.6 - dfs.data.dirs

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_command-line-installation/content/determ...
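If the quoted definition is taken literally, DISKS would be the number of directories configured in dfs.datanode.data.dir, which Hadoop reads as a comma-separated list. Here is a minimal Python sketch of that count. The helper name and the example paths are hypothetical, and the sketch assumes one data directory per disk, which the documentation does not guarantee and which does not settle the RAID case.

```python
def count_data_dirs(data_dir_value: str) -> int:
    """Count the non-empty, comma-separated entries of a dfs.datanode.data.dir value."""
    entries = [entry.strip() for entry in data_dir_value.split(",")]
    return len([entry for entry in entries if entry])


# Hypothetical property value: one data directory per disk mount point.
example = "/grid/0/hadoop/hdfs/data,/grid/1/hadoop/hdfs/data,/grid/2/hadoop/hdfs/data"
print(count_data_dirs(example))
```

Under this reading, a RAID 5 volume exposed as a single mount point with one data directory would count as one entry, regardless of how many physical disks sit behind it.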