Created 08-18-2016 06:10 AM
The documentation says that map tasks run on DataNodes and have data-locality constraints that the scheduler tries to honor, while reduce tasks can run anywhere in the cluster. Does "can run anywhere in the cluster" refer only to the DataNodes, or is the ResourceManager machine also considered part of the cluster, so that reduce tasks could run there too?
Created 08-18-2016 06:21 AM
Reduce tasks "can run anywhere in the cluster" means they can run on any node that has a NodeManager installed.
Created 08-18-2016 06:22 AM
"Can run anywhere in the cluster" means anywhere a NodeManager has been deployed. That can be any slave node, or a master node if a NodeManager is installed on it.
Hope this information helps!
Please do let me know if you have any further questions.
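To make the rule in the answers above concrete, here is a toy Python sketch. It does not call any Hadoop or YARN API; the host names, the `Host` class, and the `container_hosts` helper are all hypothetical and only model the idea that a task container can be placed on a host if, and only if, a NodeManager runs there.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical description of one machine and the daemons it runs."""
    name: str
    daemons: set = field(default_factory=set)


def container_hosts(hosts):
    """Return the names of hosts that could run task containers.

    Assumption: the only requirement modeled here is a running NodeManager.
    """
    return [h.name for h in hosts if "NodeManager" in h.daemons]


# Made-up cluster layout, for illustration only.
cluster = [
    Host("master1", {"ResourceManager", "NameNode"}),     # no NodeManager
    Host("master2", {"ResourceManager", "NodeManager"}),  # master with NodeManager
    Host("worker1", {"DataNode", "NodeManager"}),
    Host("worker2", {"DataNode", "NodeManager"}),
]

eligible = container_hosts(cluster)
print(eligible)
```

In this made-up layout, `master1` is excluded because it runs only the ResourceManager and NameNode, while `master2` is eligible because a NodeManager is installed alongside the ResourceManager.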
Created 08-29-2016 01:31 AM
@Fasil Ahamed - Can you please accept the appropriate answer?