Agreed, but is there a way to avoid this wastage, apart from copying the data to the local file system (LFS) and then back to HDFS?
Example: We have a 500 MB file with a block size of 128 MB, i.e. 4 blocks on HDFS. Now that we have changed the block size to 256 MB, how can we make the file on HDFS use 2 blocks (at the 256 MB block size) instead of 4?
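To make the arithmetic in this example concrete, here is a minimal Python sketch of how a file of a given size splits into blocks. The function name `block_layout` is hypothetical and only models the block counts; it does not talk to HDFS.

```python
def block_layout(file_size_mb, block_size_mb):
    """Return the size in MB of each block a file of this size would occupy."""
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("file size must be >= 0 and block size must be > 0")
    # Full blocks first, then one trailing partial block if anything is left.
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    layout = [block_size_mb] * full_blocks
    if remainder:
        layout.append(remainder)
    return layout


# The 500 MB file from the example, before and after the block size change.
old_layout = block_layout(500, 128)
new_layout = block_layout(500, 256)

print(len(old_layout), old_layout)  # 4 blocks, the last one partial
print(len(new_layout), new_layout)  # 2 blocks, the last one partial
```

Note that with a 256 MB block size the second block holds only the remaining 244 MB, so it is not a full 256 MB block.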
Thanks a lot for your time.
Yes, you are correct, and I am looking for a tool other than distcp.
Thanks a lot for your time on this again.
Nice to know it has answered your question.
Could you accept the answer I gave by clicking the Accept button below? That would help community users find the solution quickly for these kinds of errors.
This is regarding changing the block size of an existing file from 64 MB to 128 MB.
We are facing some issues when we delete the files.
So, is there a way to change the block size of an existing file without removing the file?
Thanks for the details.
I have a cluster with 220 million files, of which 110 million are less than 1 MB in size.
Default block size is set to 128 MB.
What should the block size be for files smaller than 1 MB? And how can we set it on a live cluster?
Total Files + Directories: 227008030
Disk Remaining: 700 TB / 3.5 PB (20%)
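As a rough sanity check on the numbers in this post, the following Python sketch estimates an upper bound on the data held in the small files. It assumes that a file smaller than the block size only occupies its actual length on disk (not a full 128 MB block), treats every small file as exactly 1 MiB (the stated maximum), and ignores replication. The variable names are illustrative only.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4

total_file_count = 220_000_000   # "220 million files" from the post
small_file_count = 110_000_000   # files reported as smaller than 1 MB
max_small_file_bytes = 1 * MIB   # upper bound per small file (assumption)

# Upper bound on raw bytes in small files, before replication.
upper_bound_bytes = small_file_count * max_small_file_bytes
upper_bound_tib = upper_bound_bytes / TIB

# Share of all files that are small.
small_file_share = small_file_count / total_file_count

print(f"small files hold at most about {upper_bound_tib:.1f} TiB before replication")
print(f"small files are {small_file_share:.0%} of all files")
```

Under these assumptions, half of the files account for only a small part of the 3.5 PB capacity, which suggests the concern with these files is their count rather than the disk space they use.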