Created 01-03-2019 05:00 PM
Hi all,
We have seen a few cases where disks on a datanode machine became 100% full
because the container stdout files are huge.
For example:
/grid/sdb/hadoop/yarn/log/application_151746342014_5807/container_e37_151003535122014_5807_03_000001/stdout
From df -h we can see:
df -h /grid/sdb
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb        1.8T  1.8T    0T 100% /grid/sdb
Any suggestions on how to keep the stdout files from growing this large? The issue actually causes the HDFS component on the datanode to stop.
Second, since the path of stdout is:
/var/log/hadoop-yarn/containers/[application id]/[container id]/stdout
is it possible to limit the file size, or to purge stdout when the file reaches a threshold?
Created 01-03-2019 11:30 PM
What value is set for the "yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage" property in your YARN configuration?
-> This is the maximum percentage of disk space utilization allowed, after which a disk is marked as bad. Values can range from 0.0 to 100.0. If the value is greater than or equal to 100, the NodeManager will check for a full disk. This applies to yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs. The default value is 90.0%. So either clean up the disk that the unhealthy node is using, or increase the threshold in yarn-site.xml.
Ambari -> YARN -> Configs -> Advanced yarn-site -> check "yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage".
"yarn.nodemanager.log-dirs": This property determines where the container logs are stored on the node while the containers are running. It is always best to allocate a dedicated disk for it. Check the path set in "yarn.nodemanager.log-dirs" and, if needed, move it to a dedicated disk with enough free space.
Also, please check the value of the "yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb" property. Its default value is 0. It is the minimum space, in MB, that must be available on a disk for it to be used. This applies to yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs.
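To illustrate how the two thresholds above interact, here is a minimal Python sketch. It is a simplified approximation written for this thread, not the NodeManager's actual code; the function names `disk_is_healthy` and `check_path` are hypothetical, and the exact comparison rules are an assumption based on the property descriptions.

```python
import shutil

MB = 1024 * 1024


def disk_is_healthy(used_bytes, total_bytes, free_bytes,
                    max_util_pct=90.0, min_free_mb=0):
    """Return False if the disk would be treated as bad under either threshold.

    max_util_pct mirrors max-disk-utilization-per-disk-percentage,
    min_free_mb mirrors min-free-space-per-disk-mb (both assumed semantics).
    """
    used_pct = 100.0 * used_bytes / total_bytes
    free_mb = free_bytes / MB
    # Over the utilization limit, or completely full regardless of the limit.
    if used_pct > max_util_pct or used_pct >= 100.0:
        return False
    # Not enough free space left on the disk.
    if free_mb < min_free_mb:
        return False
    return True


def check_path(path, max_util_pct=90.0, min_free_mb=0):
    """Apply the same check to a real mount point, e.g. a log-dirs disk."""
    usage = shutil.disk_usage(path)
    return disk_is_healthy(usage.used, usage.total, usage.free,
                           max_util_pct, min_free_mb)
```

For example, `check_path("/grid/sdb", 90.0, 1000)` would report whether that disk passes both limits with the values mentioned in this thread.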
Created 01-04-2019 06:31 AM
@Jay the value is yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage=90
Created 01-04-2019 06:34 AM
yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb=1000M in my Ambari
Created 01-04-2019 06:35 AM
@Jay just to mention that we want to limit the size of stdout or stderr. Is that possible? For example, say we want to cap each file at 1G.
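As a stopgap while looking for a real size limit, something like the following Python sketch could at least report which container stdout/stderr files have passed a chosen size. It is only a scan, not a YARN feature; the function name `find_oversized_logs` and the 1 GiB limit in the usage note are hypothetical choices for illustration.

```python
import os


def find_oversized_logs(log_root, limit_bytes, names=("stdout", "stderr")):
    """Walk log_root and return (path, size) for matching files over the limit.

    Results are sorted largest first. Files that vanish or cannot be
    read during the scan are skipped.
    """
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(log_root):
        for name in filenames:
            if name not in names:
                continue
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Container directories can be cleaned up while we scan.
                continue
            if size > limit_bytes:
                oversized.append((path, size))
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

Calling `find_oversized_logs("/grid/sdb/hadoop/yarn/log", 1024 ** 3)` would list every stdout or stderr file above 1 GiB under that directory. What to do with such files (for example, stopping the offending application) is left out on purpose, since truncating a file that a running container still writes to may not free the space.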
Created 01-05-2019 06:58 PM
@Jay any update?