If a DataNode goes down, what does the admin have to do?

Contributor
 
1 ACCEPTED SOLUTION

Master Mentor

@Mohammad Shamim

If the cluster has many DataNodes, losing one DN will not cause much harm: because of the replication factor, its data has replicas on other DNs.
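
A quick way to confirm this after a DN is lost is to ask the NameNode which DataNodes it considers dead and whether any blocks are under-replicated. The following is only a minimal sketch: it assumes it is run as the hdfs user on a host with the HDFS client installed, the helper name count_under_replicated is made up for illustration, and the wording of the fsck summary line may differ between Hadoop versions.

```shell
#!/usr/bin/env bash
# Illustrative helper (not part of Hadoop): read an "hdfs fsck" summary on
# stdin and print the number of under-replicated blocks.
count_under_replicated() {
  awk -F: 'tolower($1) ~ /under.replicated blocks/ { split($2, a, " "); print a[1]; exit }'
}

# Only run the HDFS commands when the client is actually available.
if command -v hdfs >/dev/null 2>&1; then
  # DataNodes the NameNode currently reports as dead.
  hdfs dfsadmin -report -dead
  # Under-replicated block count from a full filesystem check.
  hdfs fsck / | count_under_replicated
fi
```

A non-zero count right after a DN failure is expected; the NameNode normally re-replicates those blocks onto the remaining DNs over time.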

However, it is worth investigating why the DN went down. The DN log and the Garbage Collection log can help.

Sometimes a JVM crash can also cause the DN to go down.

The immediate admin task is to attempt to bring the DataNode back up. The admin can then look at the DN log and the Garbage Collection log to see why the DN went down. (If the DataNode JVM crashed, an "hs_err_pid" file is usually generated, and it can be reviewed.)
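
As a rough sketch of that investigation, the script below lists any JVM crash reports and prints the most recent ERROR/FATAL lines from the DN log. It rests on assumptions: the log directory is /var/log/hadoop/hdfs, the crash file was written to that directory (the actual location depends on the JVM's -XX:ErrorFile setting or its working directory), and the DN log follows the hadoop-hdfs-datanode-<hostname>.log naming pattern. The helper names list_crash_files and recent_errors are illustrative only.

```shell
#!/usr/bin/env bash
# Assumption: default log directory; override with LOG_DIR=... if customized.
LOG_DIR="${LOG_DIR:-/var/log/hadoop/hdfs}"

# List JVM crash reports (hs_err_pid*) found directly in the given directory.
list_crash_files() {
  find "$1" -maxdepth 1 -type f -name 'hs_err_pid*' 2>/dev/null
}

# Print the last N (default 20) ERROR or FATAL lines of a log file.
recent_errors() {
  grep -E 'ERROR|FATAL' "$1" | tail -n "${2:-20}"
}

if [ -d "$LOG_DIR" ]; then
  list_crash_files "$LOG_DIR"
  # Assumed file name pattern for the DataNode log.
  for f in "$LOG_DIR"/hadoop-hdfs-datanode-*.log; do
    if [ -f "$f" ]; then
      recent_errors "$f"
    fi
  done
fi
```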


From Ambari 2.5 onwards there is a feature called "Service Auto Start" that you might want to look at: https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.1.0/bk_ambari-operations/content/ch07s04.html

Enabling auto-start for a service causes the ambari-agent to attempt to restart service components that are in a stopped state, without manual effort by a user.


Contributor

Where can we find the DN logs and the Garbage Collection logs?

Master Mentor

@Mohammad Shamim

Normally you will find the DataNode logs (and the GC log as well) in the following directory:

/var/log/hadoop/hdfs/

But if you (or another admin) have customized the path, you can find the log directory in the output of the following command on the DataNode:

# ps -ef | grep DataNode

Look for parameters like the following in the "ps" command output:

"-Dhadoop.log.dir=/var/log/hadoop/hdfs" and "-Xloggc:/var/log/hadoop/hdfs/gc.log-201707171201"
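
To avoid reading the long "ps" line by eye, the two values can be pulled out with grep and cut. This is just a sketch: extract_log_dir and extract_gc_log are hypothetical helper names, and it assumes the option values contain no spaces and that grep supports the -o flag (GNU and BSD grep do).

```shell
#!/usr/bin/env bash
# Illustrative helpers: read a JVM command line on stdin and print the value
# of one option (first occurrence only, since options can be repeated).
extract_log_dir() {
  grep -o -- '-Dhadoop\.log\.dir=[^ ]*' | head -n 1 | cut -d= -f2-
}

extract_gc_log() {
  grep -o -- '-Xloggc:[^ ]*' | head -n 1 | cut -d: -f2-
}

# The "[D]ataNode" pattern keeps the grep process itself out of the match.
cmdline="$(ps -ef 2>/dev/null | grep '[D]ataNode' || true)"
if [ -n "$cmdline" ]; then
  printf '%s\n' "$cmdline" | extract_log_dir
  printf '%s\n' "$cmdline" | extract_gc_log
fi
```

This only works while the DataNode process is running; if the DN is already down, run it on another DataNode host with the same configuration or check the default directory instead.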

Contributor

Thanks for your reply, but could you tell me on which node we will find the logs?

Master Mentor

@Mohammad Shamim

Every DataNode keeps its own logs in the path mentioned above. You will need to look on the DataNode host that went down.