New Contributor
Posts: 5
Registered: ‎02-03-2017

HDFS disk usage doubled after upgrade

 

I upgraded Cloudera Manager from 5.5.3 to 5.10.0 and then upgraded CDH from 5.5.1 to 5.8.4. After these operations, the disk usage of all DataNodes shown on the Hosts -> All Hosts page increased. In the HDFS file browser and with CLI commands, almost every directory appears to be double its previous size, but I see no difference in file counts, types, names, etc. I get the same result when I check disk usage from the Linux terminal. I am a little confused and need help figuring out what happened.

Expert Contributor
Posts: 170
Registered: ‎05-16-2016

Re: HDFS disk usage doubled after upgrade

Curious to know whether reinstalling the same Cloudera Manager Server version that you were previously running solved the issue?

New Contributor
Posts: 5
Registered: ‎02-03-2017

Re: HDFS disk usage doubled after upgrade

@csguna

I haven't tried that, and probably would not be able to.

New Contributor
Posts: 5
Registered: ‎02-03-2017

Re: HDFS disk usage doubled after upgrade

An update: I was mistaken on some values.

 

The size values shown in the HDFS file browser and returned by hdfs dfsadmin -report are the expected values. However, the Cloudera metrics and charts continue to show increasing values. Running du -sch on the dfs folders in the Linux terminal also gives large numbers. I also noticed that the increase started a couple of days before the upgrade I mentioned, so it is unlikely that something went wrong with the upgrade.

 

Recently, another HDFS user informed us that they have been splitting large files into smaller ones to improve computing performance (??). This got me thinking: if they are splitting TBs of data into smaller files, most of them even smaller than the block size (128 MB), could that be causing usage on the file system to grow by more than 3x?

 

Is my estimate correct?
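
One way to sanity-check this estimate is a rough back-of-the-envelope calculation. The following is only a minimal sketch under stated assumptions, not a measurement of this cluster: it assumes a replication factor of 3, a 128 MB HDFS block size, and a 4 KB allocation unit on the DataNodes' local file system, and it assumes that a block holding less data than the block size is stored as a local file only as large as the data it contains. It ignores the small per-block checksum (.meta) files and any non-HDFS data on the disks. The function name raw_usage_bytes and the sample file sizes are made up for illustration.

```python
import math

MB = 1024 * 1024


def raw_usage_bytes(file_sizes, replication=3, hdfs_block=128 * MB,
                    local_fs_block=4096):
    """Estimate bytes used on DataNode disks for the given HDFS file sizes.

    Assumption: each HDFS block is a local file whose size equals the data
    it holds, rounded up to the local file system allocation unit.
    """
    total = 0
    for size in file_sizes:
        # Split the file into full HDFS blocks plus one partial last block.
        full_blocks, last = divmod(size, hdfs_block)
        block_lengths = [hdfs_block] * full_blocks + ([last] if last else [])
        for length in block_lengths:
            on_disk = math.ceil(length / local_fs_block) * local_fs_block
            total += on_disk * replication
    return total


# Hypothetical data set: 1 GiB stored as one file vs. 64 files of 16 MiB.
one_large = [1024 * MB]
many_small = [16 * MB] * 64

print(raw_usage_bytes(one_large) / MB)   # 3072.0
print(raw_usage_bytes(many_small) / MB)  # 3072.0
```

Under these assumptions, splitting the same data into files smaller than the block size leaves the estimated raw usage at roughly 3x the logical size, so the split alone would not explain growth beyond 3x. Comparing this kind of estimate with the du -sch output on the dfs folders may help narrow down where the extra space is going.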
