HDFS: how to limit the periodic du -sk (run by the hdfs user) that causes high machine load

We have a Hadoop cluster with 374 DataNode machines (HDP version 2.6.5 with Ambari 2.6.2.2). Each DataNode has 12 disks, and each disk is 15 TB.

We noticed that du -sk runs periodically as the hdfs user on each DataNode server.

We understand that the du -sk command is used to measure the size of the BP (block pool) directories.

But this operation takes a long time (possibly 10-30 minutes or more, since each disk is 15 TB), so on machines with large disks it leads to increased iowait and load.
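To put numbers on this, here is a minimal sketch of how the duration of a single du -sk run could be measured, and how the du processes running as the hdfs user could be listed. The function names time_du and list_hdfs_du are our own illustration, not part of Hadoop, and the elapsed time only has one-second resolution.

```shell
#!/usr/bin/env bash
# Report how long one "du -sk" takes on a single directory.
# Prints: <elapsed seconds> <size in KB> <directory>
time_du() {
    local dir="$1"
    local start end size_kb
    start=$(date +%s)
    # du prints "<size in KB><TAB><path>"; keep only the size.
    size_kb=$(du -sk "$dir" | cut -f1)
    end=$(date +%s)
    echo "$((end - start)) ${size_kb} ${dir}"
}

# List du -sk processes currently running as the hdfs user,
# together with how long each one has been running (etime).
list_hdfs_du() {
    ps -u hdfs -o pid,etime,args | grep '[d]u -sk'
}
```

For example, time_du /path/to/datanode/dir/current could be run once per disk (the path is a placeholder for whatever dfs.datanode.data.dir points to), and list_hdfs_du could be run while the load is high to see how long each du has been going.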

Is there any option to disable the du run? Or, at least, where is the configuration that schedules the du, so that we can reconfigure it to run every 3 hours instead of every couple of minutes?
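One candidate we found is the fs.du.interval property, which Hadoop's core-default.xml describes as the file space usage statistics refresh interval in msec (default 600000, i.e. 10 minutes). We have not confirmed that this is what drives the du on HDP 2.6.5, so the following is only a sketch of the change we have in mind under that assumption. The helper names hours_to_ms and du_interval_property are our own.

```shell
#!/usr/bin/env bash
# Convert hours to milliseconds (fs.du.interval is documented in msec).
hours_to_ms() {
    echo $(( $1 * 60 * 60 * 1000 ))
}

# Print a core-site.xml property entry for a refresh interval given in hours.
# ASSUMPTION: fs.du.interval is the setting behind the periodic du -sk.
du_interval_property() {
    printf '<property>\n  <name>fs.du.interval</name>\n  <value>%s</value>\n</property>\n' \
        "$(hours_to_ms "$1")"
}
```

Running du_interval_property 3 prints an entry with the value 10800000, which would presumably go into the custom core-site section in Ambari, followed by a DataNode restart. The value currently in effect can be checked with hdfs getconf -confKey fs.du.interval.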