Created 06-22-2017 07:17 PM
Most jobs have been running slowly since I added a datanode that previously had some hardware issues, which have now been corrected. Every time I add it to the cluster (I have tried twice), the jobs slow down. I compared its I/O rate with another server and it looks good too. Please advise.
Thanks.
Created 06-22-2017 07:22 PM
Have you confirmed whether containers running on this node (and doing non-local reads) are what is causing the jobs to be slow? If that's the case, I would recommend installing only the 'datanode' process first and, once the cluster is balanced (maybe after a day), adding the 'nodemanager' process to run containers on the node.
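One way to judge whether the cluster is "balanced" before adding the nodemanager is to compare per-node DFS usage. Below is a minimal sketch in Python, assuming you have saved the text of `hdfs dfsadmin -report` and that it contains `Name:` and `DFS Used%:` lines like the sample. The function names, the sample report, and the mean-based check are illustrative assumptions, not part of Hadoop, and the check is a simplification of what the balancer itself does.

```python
import re


def dfs_used_by_node(report_text):
    """Map each datanode name to its 'DFS Used%' value from report text."""
    usage = {}
    current = None  # stays None until the first 'Name:' line (skips the cluster summary)
    for line in report_text.splitlines():
        line = line.strip()
        name_match = re.match(r"Name:\s*(\S+)", line)
        if name_match:
            current = name_match.group(1)
            continue
        used_match = re.match(r"DFS Used%:\s*([\d.]+)%", line)
        if used_match and current is not None:
            usage[current] = float(used_match.group(1))
    return usage


def is_balanced(usage, threshold=10.0):
    """True if every node is within `threshold` percentage points of the mean.

    Simplified assumption: uses the mean of per-node percentages.
    """
    if not usage:
        return True
    mean = sum(usage.values()) / len(usage)
    return all(abs(value - mean) <= threshold for value in usage.values())


# Hypothetical, trimmed-down report used only for illustration.
SAMPLE_REPORT = """\
DFS Used%: 41.00%
Name: 10.0.0.1:50010 (node1)
DFS Used%: 62.50%
Name: 10.0.0.2:50010 (node2)
DFS Used%: 58.10%
Name: 10.0.0.3:50010 (node3)
DFS Used%: 3.20%
"""

if __name__ == "__main__":
    usage = dfs_used_by_node(SAMPLE_REPORT)
    print(usage)
    print("balanced:", is_balanced(usage))
```

In the sample, the freshly added node3 holds far less data than the others, so the check reports the cluster as not yet balanced.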
Created 06-22-2017 07:33 PM
Yes, I have the node manager running as well. And yes, it seems the containers are taking longer. Can you tell me how it will help if I balance my cluster first and then add the nodemanager to it?
Created 06-22-2017 07:43 PM
We want to avoid non-local reads of data as much as possible for best performance. Details here:
http://ercoppa.github.io/HadoopInternals/AnatomyMapReduceJob.html#maptask-launch
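A newly added datanode holds few blocks until the cluster is rebalanced, so map tasks scheduled on it mostly read their input over the network. A rough way to see this is to look at how many map tasks ran data-local. The sketch below assumes you have copied a job's counters into a plain dictionary. The key names follow the MapReduce job counters as I remember them (`DATA_LOCAL_MAPS`, `RACK_LOCAL_MAPS`, `OTHER_LOCAL_MAPS`), so please verify them against your job history UI. The function name and the sample numbers are hypothetical.

```python
def locality_share(counters):
    """Return the fraction of map tasks at each locality level.

    `counters` is a plain dict of counter name -> count (assumed input format).
    Returns None when no map-locality counters are present.
    """
    data_local = counters.get("DATA_LOCAL_MAPS", 0)
    rack_local = counters.get("RACK_LOCAL_MAPS", 0)
    other_local = counters.get("OTHER_LOCAL_MAPS", 0)
    total = data_local + rack_local + other_local
    if total == 0:
        return None
    return {
        "data_local": data_local / total,
        "rack_local": rack_local / total,
        "other_local": other_local / total,
    }


if __name__ == "__main__":
    # Made-up counter values for illustration only.
    sample = {"DATA_LOCAL_MAPS": 60, "RACK_LOCAL_MAPS": 30, "OTHER_LOCAL_MAPS": 10}
    print(locality_share(sample))
```

If the data-local share drops noticeably for jobs that run after the node is added, non-local reads are a plausible suspect.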
Created 06-23-2017 03:07 PM
I will try that and let you know how it works. Thanks.
Created 06-27-2017 01:42 PM
I did the load balancing and then realized that one of the disks had developed an I/O error out of nowhere, so that was the issue.
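For anyone hitting the same thing, here is a minimal sketch of how a failing disk might be spotted from kernel log text (for example, saved `dmesg` output). It assumes the messages contain the phrase `I/O error, dev <device>`, which is how such errors commonly appear on Linux, but the exact wording varies by kernel version. The function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed message shape: "... I/O error, dev sdc, sector 123456"
IO_ERROR = re.compile(r"I/O error, dev (\w+)")


def io_errors_by_device(log_lines):
    """Count I/O error messages per block device name."""
    counts = Counter()
    for line in log_lines:
        match = IO_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical log lines, not taken from the cluster in this thread.
    sample_lines = [
        "[1200.10] blk_update_request: I/O error, dev sdc, sector 123456",
        "[1200.25] EXT4-fs (sda1): mounted filesystem",
        "[1201.40] blk_update_request: I/O error, dev sdc, sector 123464",
    ]
    print(io_errors_by_device(sample_lines))
```

A device that shows up repeatedly here is worth checking before putting the node back into the cluster.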