05-07-2018 08:47 AM
We are seeing file descriptor and block count alerts on almost all DataNodes; some DataNodes are reported as bad or critical.
Below are the errors we are facing.
1) Concerning: The DataNode has 1,142,375 blocks. Warning threshold: 500,000 block(s). (On what basis is the block count calculated? We have 10 DataNodes with 50 TB each.)
2) Bad: Open file descriptors: 23,576. File descriptor limit: 32,768. Percentage in use: 71.95%. Critical threshold: 70.00%. (Is it okay if we raise the thresholds to 75% for warning and 85% for critical?)
ulimit -n is set to 32768.
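For reference, this is how we check the limit and the per-process descriptor usage on the DataNode hosts (a minimal sketch, assuming Linux and that the DataNode pid is known, e.g. from `pgrep -f DataNode`):

```shell
# Count open file descriptors for a process by listing /proc/<pid>/fd,
# which has one entry per open descriptor.
count_fds() {
  ls "/proc/$1/fd" | wc -l
}

# Per-process soft limit for the current shell:
ulimit -n

# Example usage against this shell itself; substitute the DataNode pid
# (e.g. from `pgrep -f DataNode`) on a real host.
count_fds $$
```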
Please suggest a solution, and let me know if you need more information.
Thanks & Regards,
05-08-2018 03:45 AM
It is based on the configurations below:
1. CM -> HDFS -> Configuration -> DataNode Block Count Thresholds -> The default value is 500,000
2. CM -> HDFS -> Configuration -> HDFS Block Size -> It may be 64 MB or 128 MB or 256 MB
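To put the threshold in context, here is a rough sketch of the arithmetic, using the 50 TB per DataNode and a 128 MB block size from above (an assumption; check your actual dfs.blocksize):

```shell
# Rough capacity arithmetic: how many full-size blocks fit on one DataNode.
# Assumes 50 TB of capacity and a 128 MB block size.
capacity_tb=50
block_mb=128
mb_per_tb=$((1024 * 1024))

blocks=$(( capacity_tb * mb_per_tb / block_mb ))
echo "Full ${block_mb} MB blocks in ${capacity_tb} TB: ${blocks}"   # 409600
```

A count of 1,142,375 blocks on a 50 TB node therefore implies an average block size well below 128 MB, which often points to many small files rather than raw capacity pressure.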
As a solution, you can try the following:
1. CM -> HDFS -> Action -> Rebalance
2. Increase the "DataNode Block Count Thresholds" value based on your capacity. NOTE: this requires a service restart.
3. In some environments, you can also ignore this warning unless you are really running out of space.
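The CM "Rebalance" action in step 1 corresponds to the command-line balancer; a CLI sketch (assumption: the HDFS client is on the PATH and commands run as the hdfs superuser):

```shell
# Move blocks until each DataNode's utilization is within 10 percentage
# points of the cluster average.
sudo -u hdfs hdfs balancer -threshold 10

# Review per-DataNode capacity and block counts afterwards.
sudo -u hdfs hdfs dfsadmin -report
```

Note that the balancer evens out disk utilization across nodes; it will not reduce the total block count, so the threshold change in step 2 (or cleaning up small files) is still needed if the count itself is the concern.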
The block count above relates to DataNodes, while the file descriptor limit also applies to other daemons such as the NameNode, Secondary NameNode, JournalNode, etc. I have never explored file descriptors before, so I don't want to comment on that part.