3 NodeManagers started, while only 1 is live

Contributor

I installed a Hadoop cluster with Ambari. It contains 1 master node and 2 slaves. A DataNode and a NodeManager are installed on each instance, so in total I have 3 DataNodes and 3 NodeManagers. In the Ambari UI I noticed that all 3 DataNodes are live, while only 1 NodeManager is live (though all 3 are started). Please see the attached screenshot.

I tried adding this setting to Custom hdfs-site and restarted everything, but I still get the same issue:

dfs.namenode.rpc-bind-host = 0.0.0.0

72616-screen-shot-2018-05-05-at-162217.png

3 REPLIES

Expert Contributor

@Liana Napalkova Click on Hosts and try to start the NodeManagers. If there is an error message, providing its output would be really helpful.

To do this, click on the operation and click through to the output/error message. (Once you click Start, the number of pending operations should change to 1. Click on it to get more info, and keep clicking through until you reach the log.)

If that's not clear, you can always grab the log directly: ssh into the server and look in the directory /var/log/[path to log]. (I don't recall the exact path off the top of my head, but all logs for HDP are in /var/log/, so you should be able to find it with a little bit of looking.)
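Since the exact path is not given here, one way to do that "little bit of looking" is to search /var/log by file name. This is only a sketch: the helper name find_nm_logs and the assumption that the NodeManager log files have "nodemanager" in their names are mine, not something confirmed in this thread.

```shell
# Hypothetical helper: list files under a directory (default /var/log)
# whose names contain "nodemanager", case-insensitively.
# Assumption: the NodeManager log files are named that way on your install.
find_nm_logs() {
  find "${1:-/var/log}" -type f -iname '*nodemanager*' 2>/dev/null
}

# Search the default log root; permission errors are silenced.
find_nm_logs /var/log
```

Once a file turns up, something like tail -n 100 on it usually shows the most recent startup or registration errors.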

Hope this helps.

Matt

Mentor

@Liana Napalkova

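Check disk utilization on the nodes whose NodeManagers are not reporting in the Ambari UI, and compare it to the value of

yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage

Disk usage should be below 90 percent, which is the default value of this parameter. Once a disk goes over the threshold, the NodeManager's disk health checker marks it as bad, and the node can be reported as unhealthy even though the process is started.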

# df -h 
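To avoid eyeballing the df output, the comparison can be scripted. The sketch below is an approximation under stated assumptions: the function name over_threshold is hypothetical, 90 is the default threshold mentioned above (use your configured value if it differs), and the percentage df reports may differ slightly from what YARN itself computes. The disks that matter are the ones holding yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs.

```shell
# Hypothetical helper: read `df -P` style output on stdin and print the
# mount point and usage of every filesystem at or above the given percent.
# Assumption: mount points contain no spaces.
over_threshold() {
  awk -v t="$1" 'NR > 1 {
    use = $5                 # Capacity column, e.g. "95%"
    sub(/%/, "", use)        # strip the percent sign
    if (use + 0 >= t + 0) print $6, use "%"
  }'
}

# -P forces the POSIX one-line-per-filesystem format.
df -P | over_threshold 90
```

If this prints a mount point that backs one of the NodeManager directories, freeing space there is the first thing to try.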

Restart the Ambari agents on the offending nodes, preferably by killing the current PID and then starting the agent again:

# ambari-agent start 

Sift through the logs below for any pointers:

# /var/log/hadoop/hdfs/hadoop-hdfs-namenode-{FQDN}.out/log

Please report back with what you find.

If the NodeManager process is running on the machine, the NodeManager log should tell us what is actually happening there.
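A quick way to confirm the first part is to look for the NodeManager JVM on the affected host and, where a YARN client is available, ask the ResourceManager how it sees each node. This is a sketch, not a definitive check: the function name nm_status is hypothetical, and matching on the Java main class name assumes the process was launched the standard way.

```shell
# Hypothetical helper: report whether a NodeManager JVM is running locally.
# Matches on the NodeManager main class in the process command line.
nm_status() {
  if pgrep -f 'org.apache.hadoop.yarn.server.nodemanager.NodeManager' >/dev/null 2>&1; then
    echo "NodeManager JVM is running"
  else
    echo "NodeManager JVM is not running"
  fi
}

nm_status

# If the yarn CLI is on this host, list all nodes with their states
# (including ones the ResourceManager no longer considers RUNNING).
if command -v yarn >/dev/null 2>&1; then
  yarn node -list -all
fi
```

A node that is running locally but missing or not in the RUNNING state in the list points at a health-check or registration problem rather than a failed start.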