
hadoop fs -du returns du: java.util.ConcurrentModificationException

Contributor

Hi,

I started getting this message today when trying to check a folder's size in HDFS:

 

hadoop fs -du -s -h /data/
du: java.util.ConcurrentModificationException

 

We are using Cloudera 5.11.

 

The only thing that happened yesterday was deleting a huge amount of data from HDFS.

 

Note - I'm using HDFS snapshots to back up the cluster. Could it be related?
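
For reference, the backups are plain HDFS snapshots, created along these lines (the snapshot name here is just an example):

hdfs dfsadmin -allowSnapshot /data
hdfs dfs -createSnapshot /data backup-20170601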

 

This did not happen on CDH 5.4.3, where we also used snapshots.

 

Any help will be appreciated.

 

Thanks

 

Alon

1 ACCEPTED SOLUTION

Rising Star

Hey Alon, 

 

You are correct in assuming that the issue is related to the snapshots, specifically the ones under /data. This is a bug that will be resolved in the upcoming 5.11.1 maintenance release.

 

It was originally identified in HDFS-10797, and there was an attempt to fix it in HDFS-11515. However, HDFS-11661 later revealed that the original fix introduced memory pressure that can get quite high when the filesystem has tens or hundreds of millions of files while du is running.

 

To resolve this, you can try to find the offending .snapshot entries and delete them, although this may prove difficult. I would suggest installing the maintenance release, which should be available in the coming weeks (*not guaranteed, as we do not give specific timelines for maintenance release dates*). You could also use an API call like http://HOST.DOMAIN.COM:7180/api/v16/clusters/Cluster%201/services/HDFS-1/reports/hdfsUsageReport to pull the statistics, or view them in the HDFS usage reports.
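
As a rough sketch of both workarounds (the credentials and snapshot name below are placeholders, adjust them to your cluster):

# List the snapshottable directories and the snapshots under /data
hdfs lsSnapshottableDir
hdfs dfs -ls /data/.snapshot

# Delete a specific snapshot (repeat per snapshot name)
hdfs dfs -deleteSnapshot /data SNAPSHOT_NAME

# Or pull the usage statistics from the Cloudera Manager API instead of running du
curl -u admin:admin 'http://HOST.DOMAIN.COM:7180/api/v16/clusters/Cluster%201/services/HDFS-1/reports/hdfsUsageReport'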

 

 

 


2 REPLIES


Contributor

Thanks Borg,

 

It happened after deleting a huge amount of data from HDFS.

 

To resolve it for the meantime, I deleted all the snapshots and created a new one.
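
Roughly what I ran (the snapshot names are just from my own setup):

hdfs dfs -deleteSnapshot /data backup-old
hdfs dfs -createSnapshot /data backup-new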

 

It's working OK.

 

Waiting for 5.11.1 🙂

 

Many Thanks

 

Alon