Created 06-12-2017 03:54 AM
Hi,
Today I started getting this message when trying to check a folder's size in HDFS:
hadoop fs -du -s -h /data/
du: java.util.ConcurrentModificationException
I'm using Cloudera CDH 5.11.
The only thing that happened yesterday was the deletion of a huge amount of data from HDFS.
Note: I'm using Hadoop snapshots to back up HDFS. Could that be related?
This did not happen on CDH 5.4.3, which was also using snapshots.
Any help will be appreciated.
Thanks
Alon
Created 06-13-2017 12:13 PM
Hey Alon,
You are correct in assuming that the issue is related to the snapshots, specifically in /data. This is a bug that will be resolved in the coming 5.11.1 maintenance release.
It was originally identified in HDFS-10797, and there was an attempt to fix it in HDFS-11515. In the meantime, however, HDFS-11661 revealed that the original fix introduced memory pressure that can get quite high when the filesystem has tens or hundreds of millions of files while du is running.
To resolve it, you can try to find the .snapshot files causing the error and delete them; however, this may prove difficult. I would suggest that you install the maintenance release, which should be available in the coming weeks (*not guaranteed, as we do not give specific timelines for maintenance release dates*). You could also use an API call like (http://HOST.DOMAIN.COM:7180/api/v16/clusters/Cluster%201/services/HDFS-1/reports/hdfsUsageReport) to pull the statistics, or view them in the HDFS usage reports.
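As a rough sketch of the API route mentioned above, the report URL could be fetched with curl. The host, cluster name, and service name are taken from the example URL; CM_HOST, CM_USER, and CM_PASS are hypothetical environment variables, and the helper function name is made up for illustration. It assumes the Cloudera Manager API accepts HTTP basic authentication on your install.

```shell
#!/usr/bin/env bash
# Sketch only. CM_HOST, CM_USER and CM_PASS are placeholder variables
# (assumptions), not names defined by Cloudera Manager.

# Build the usage report URL. Spaces in the cluster name are
# percent-encoded, as in "Cluster%201" from the example URL.
usage_report_url() {
  local host="$1" cluster="$2" service="$3"
  local encoded_cluster="${cluster// /%20}"
  printf 'http://%s:7180/api/v16/clusters/%s/services/%s/reports/hdfsUsageReport\n' \
    "$host" "$encoded_cluster" "$service"
}

# Only contact the server when the script is invoked with "fetch".
if [ "${1:-}" = "fetch" ]; then
  curl -s -u "${CM_USER}:${CM_PASS}" \
    "$(usage_report_url "${CM_HOST}" "Cluster 1" "HDFS-1")"
fi
```

Saved under a name of your choice and run with the single argument fetch, it prints whatever the report endpoint returns; adjust the cluster and service names to match your own deployment.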
Created 06-14-2017 12:24 AM
Thanks Borg,
It happened after deleting a huge amount of data from HDFS.
To resolve it in the meantime, I deleted all the snapshots and created a new one.
It's working OK.
Waiting for 5.11.1 🙂
Many Thanks
Alon
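For anyone wanting to try the same workaround, here is a minimal sketch of deleting every snapshot of a directory and taking a fresh one. It is not necessarily the exact procedure used above. The directory /data comes from the thread; SNAP_DIR, the function names, and the new snapshot name are assumptions, and the parsing assumes snapshot names contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch only: removes EVERY snapshot of a snapshottable directory and
# then creates a new one. SNAP_DIR and the "fresh-<date>" name are
# assumptions made for illustration.

# Read `hdfs dfs -ls <dir>/.snapshot` output on stdin and print the
# snapshot names. Directory entries start with "d"; the path is the
# last field, and the snapshot name is its final component.
snapshot_names() {
  awk '/^d/ { n = split($NF, parts, "/"); print parts[n] }'
}

# Delete each existing snapshot, then take a new one.
recreate_snapshots() {
  local dir="$1" new_name="$2" name
  hdfs dfs -ls "${dir}/.snapshot" | snapshot_names | while read -r name; do
    hdfs dfs -deleteSnapshot "$dir" "$name"
  done
  hdfs dfs -createSnapshot "$dir" "$new_name"
}

# Only touch HDFS when the script is invoked with "run".
if [ "${1:-}" = "run" ]; then
  recreate_snapshots "${SNAP_DIR:-/data}" "fresh-$(date +%Y%m%d)"
fi
```

Deleting snapshots discards the data they were protecting, so it is worth confirming that the old snapshots are no longer needed before running anything like this.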