Is there an -ignoreCrc equivalent when using getmerge?

Rising Star

When copying files from HDFS to a local file system:

hdfs dfs -copyToLocal <source> <dest>

you have the options -crc and -ignoreCrc to control whether checksum files are written and whether checksums are verified.

I am merging/copying out to local using

hdfs dfs -getmerge <sourceDir> <destFile>

and end up with a hidden .destFile.crc file for each destFile.

Is there an equivalent way to turn this function off, or otherwise automatically remove the .destFile.crc if the corresponding destFile is deleted (from the local file system)?

Thank you!

5 REPLIES

Master Mentor

@Emily Sharpe

I don't see any such option for -getmerge. You may want to write a shell script that removes the .crc files from a particular location, something like the following. You can run it from a cron job.

find . -type f -name '*.crc' -exec rm {} +
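
If deleting every .crc file under a directory is too broad, the same idea can be narrowed. The following is only a minimal sketch, assuming the checksum file is always named .destFile.crc and sits next to destFile, as described in the question. The function names crc_sidecar_path, getmerge_nocrc and remove_orphan_crc are hypothetical helpers, not part of the hdfs command.

```shell
#!/bin/sh

# Print the path of the hidden checksum file that sits next to a local file,
# e.g. /tmp/out/merged.txt -> /tmp/out/.merged.txt.crc
# (assumes the .destFile.crc naming described in the question).
crc_sidecar_path() {
    printf '%s/.%s.crc\n' "$(dirname -- "$1")" "$(basename -- "$1")"
}

# Hypothetical wrapper: run getmerge, then delete only the checksum file
# that belongs to this destination file.
getmerge_nocrc() {
    hdfs dfs -getmerge "$1" "$2" || return 1
    rm -f -- "$(crc_sidecar_path "$2")"
}

# Delete hidden .crc files under a directory whose data file no longer exists.
# File names containing newlines are not handled by this sketch.
remove_orphan_crc() {
    find "$1" -type f -name '.*.crc' | while IFS= read -r crc; do
        name=$(basename -- "$crc")
        name=${name#.}       # strip the leading dot
        name=${name%.crc}    # strip the .crc suffix
        [ -e "$(dirname -- "$crc")/$name" ] || rm -f -- "$crc"
    done
}
```

With these defined, getmerge_nocrc <sourceDir> <destFile> would replace the plain getmerge call, and remove_orphan_crc <localDir> could be the command that the cron job runs.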

Rising Star

Hi @Neeraj Sabharwal, thank you for the script line. Looks like I will be adding that in!

Contributor

Hi @Neeraj Sabharwal,

I am trying to save my output results in Spark using saveAsTextFile(""). The result is multiple part files (part-00000, part-00001, and so on) along with .crc files in the output directory. Do you have any idea how I can avoid creating the .crc files?

Rising Star

Hi @Chris Nauroth, thanks for the confirmation, and great to know the option has been suggested 🙂