I have a cluster of 11 nodes: 2 masters, 4 running DataNode + NodeManager, and 5 running NodeManager only. Each node in the cluster has 32 cores. I was running the TestDFSIO benchmark with an 8 TB load spread across 10,000 files of 800 MB each. While the job was running, I observed in the ResourceManager web UI that the 5 NodeManager-only slaves each started 32 containers and executed 32 map tasks, then stopped taking map tasks; no further containers from these nodes were allocated to the job. Only the 4 slaves running both DN and NM picked up the remaining map tasks. Why is this happening?
I have configured a cluster with CDH 5.8.0 on CentOS 7.2. The cluster has 11 nodes: 2 master nodes, 4 DN (DataNode) + NM (NodeManager) nodes, and 5 NM-only nodes. Each node has a 32-core CPU. I was trying to run a TestDFSIO job on this cluster. When the job starts, each of the 4 DN+NM nodes starts 15 containers and each of the 5 NM-only nodes starts 32 containers, 220 containers in total. After executing 32 map tasks each, the 5 NM-only nodes do not take up any further map tasks; all remaining map tasks are executed by the 4 DN+NM nodes. Why is this happening?
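For reference, the figures quoted above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the variable names are mine, the data size uses decimal units (1 TB = 1,000,000 MB), and the remaining-task count assumes TestDFSIO runs one map task per file and that all 220 initial containers are map tasks.

```python
# Node and container counts as described in the question
dn_nm_nodes = 4              # nodes running DataNode + NodeManager
nm_only_nodes = 5            # nodes running NodeManager only
containers_per_dn_nm = 15    # containers started on each DN+NM node
containers_per_nm_only = 32  # containers started on each NM-only node

# Containers started in the first wave across all worker nodes
initial_containers = (dn_nm_nodes * containers_per_dn_nm
                      + nm_only_nodes * containers_per_nm_only)

# TestDFSIO load: 10,000 files of 800 MB each
files = 10_000
file_size_mb = 800
total_tb = files * file_size_mb / 1_000_000  # decimal terabytes

# Assumption: one map task per file, so this many tasks are left
# after the first wave and end up on the 4 DN+NM nodes
remaining_map_tasks = files - initial_containers

print(f"initial containers:  {initial_containers}")
print(f"total load (TB):     {total_tb}")
print(f"remaining map tasks: {remaining_map_tasks}")
```

So under those assumptions, the 5 NM-only nodes handle only 160 of the 10,000 map tasks before going idle.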