Worker nodes in an unhealthy state

Hi,

I am getting the following message in Ambari on every worker node:

1/1 local-dirs usable space is below configured utilization percentage/no more usable space [ /data/2/hadoop/yarn/local : used space above threshold of 90.0% ] ;


I have also checked the ResourceManager UI > Nodes. 6 nodes are currently in an unhealthy state.

Could you please help me get them back to a healthy state?


Thank you

1 REPLY

Hi @Koffi,

 

yarn.nodemanager.log-dirs is where the containers of user jobs write their stdout, stderr and syslog files. YARN has log aggregation, which copies the log files from the NodeManager's local directory to HDFS and then removes the local copies.
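For reference, log aggregation is switched on through yarn-site.xml. The fragment below is only a sketch: yarn.log-aggregation-enable and yarn.nodemanager.remote-app-log-dir are standard YARN properties, but the HDFS path /app-logs is an assumption on my side, so check what your cluster actually uses.

```xml
<!-- Sketch only: enable log aggregation so container logs are moved off the local disk -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- HDFS target directory for aggregated logs; /app-logs is an assumed value -->
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/app-logs</value>
</property>
```

With aggregation enabled, the logs of finished applications should not keep piling up under the log-dirs location.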

 

In the YARN configs in Ambari (yarn-site.xml), look for the property below (the value shown is from my cluster):
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/var/log/hadoop/yarn/log</value>
</property>

 

Make sure a dedicated disk is allocated to it. Disks can also fill up when many containers run in parallel on a NodeManager and write a large amount of intermediate data.

 

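The most common cause of local-dirs or log-dirs being marked bad is used disk space on the node exceeding YARN's max-disk-utilization-per-disk-percentage threshold (full name: yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage), which defaults to 90.0%. In your case the message points at /data/2/hadoop/yarn/local.
Either clean up that disk on each unhealthy node, or increase the threshold in yarn-site.xml.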
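If you go for the second option, the change would look roughly like the fragment below. The property name is the standard YARN one, but the value 95.0 is only an illustrative assumption, not a recommendation for your cluster.

```xml
<!-- Sketch only: raise the per-disk utilization limit used by the NodeManager disk health checker -->
<!-- 95.0 is an assumed example value; the default is 90.0 -->
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>95.0</value>
</property>
```

The NodeManagers need a restart to pick up the new value. Keep in mind that raising the threshold only buys time, since the disk will still fill up if nothing is cleaned.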

 

If you would like more detail, please read: https://blog.cloudera.com/resource-localization-in-yarn-deep-dive/