
distcp error: Mismatch in length of source

Expert Contributor

DistCp is failing to copy CSV files with the error "Mismatch in length of source" or "Cannot obtain block length". Some of the files it complains about are zero-sized. I am copying from HDP 2.4.3 to HDP 2.6. The command I'm running is below, and the logs are attached. Has anyone encountered the same problem?

hadoop distcp -bandwidth 1 -m 24 -i hdfs://nn1:8020/dir/year=2017/month=05/day=14 hdfs://nn2:8020/dir/day=20170514

Re: distcp error: Mismatch in length of source

New Contributor

@Joshua Adeleke

Run 'hdfs fsck /path/to/file' on the source cluster and check that none of the file's blocks are corrupt.
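
To make the fsck check above repeatable across a whole partition directory, here is a minimal sketch. The function name `check_hdfs_path` and its output labels are hypothetical, and the strings it greps for ('OPENFORWRITE', 'is HEALTHY') are assumptions about the fsck report text, so compare them with what your cluster actually prints. The `-openforwrite` flag is included because zero-sized files that are still open for write are an often reported cause of "Cannot obtain block length", though the logs in this thread do not confirm that this is the cause here.

```shell
# Hypothetical helper (not part of Hadoop): run fsck on one HDFS path
# and classify the result. The matched strings are assumptions about
# the fsck report format; check them against your cluster's output.
check_hdfs_path() {
  fsck_path="$1"
  # -files -blocks -locations: print per-file block detail
  # -openforwrite: also list files that are still open for write
  fsck_report=$(hdfs fsck "$fsck_path" -files -blocks -locations -openforwrite 2>&1)

  if printf '%s\n' "$fsck_report" | grep -q 'OPENFORWRITE'; then
    echo "open for write: $fsck_path"
    return 2
  fi
  if printf '%s\n' "$fsck_report" | grep -q 'is HEALTHY'; then
    echo "healthy: $fsck_path"
    return 0
  fi
  echo "needs attention: $fsck_path"
  return 1
}
```

For example, 'check_hdfs_path /dir/year=2017/month=05/day=14' on the source cluster would check the directory from the question. If a file turns out to be open for write, one option to look into is 'hdfs debug recoverLease -path /path/to/file', which asks the NameNode to recover the lease so the file gets closed before DistCp is rerun.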