Created 03-27-2017 07:46 AM
I am facing a checksum mismatch while copying data between two clusters:
Caused by: java.io.IOException: Check-sum mismatch between hdfs://test1:8020/raw/ech/incr/row_bn... and hdfs://test2:8020/raw/ech/.distcp.tmp.attempt_1471351807373_0.... Source and target differ in block-size.
Also, is there a way to ignore the checksum through the Falcon UI? Thanks in advance.
Created 03-27-2017 08:26 AM
It appears that the block size differs between your two clusters. You can set the -preserveBlockSize or -skipChecksum flag as below.
1. Suspend all Falcon jobs.
2. Modify the Falcon mirroring template at /usr/hdp/current/falcon-server/data-mirroring/workflows/hdfs-replication-workflow.xml. Add the following arguments at the end of the argument list to preserve the block size (see the sketch after these steps):
<arg>-preserveBlockSize</arg> <arg>true</arg>
3. Restart Falcon through Ambari.
4. Resubmit the job and verify that the HDFS mirror job is now working fine.
This will set the property for all mirror jobs.
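For orientation, here is a minimal sketch of where the two new <arg> elements from step 2 would sit inside the replication action of hdfs-replication-workflow.xml. The surrounding action name, main class, and existing arguments shown here are assumptions based on a typical Falcon/Oozie DistCp replication workflow, not copied from your actual template; only the last two <arg> lines are the addition described above.

<!-- Sketch only: placement of the added arguments inside the replication action.
     Action name, main class, and the pre-existing arguments are assumed examples. -->
<action name="drReplication">
    <java>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <main-class>org.apache.falcon.replication.FeedReplicator</main-class>
        <arg>-maxMaps</arg>
        <arg>${distcpMaxMaps}</arg>
        <arg>-sourcePaths</arg>
        <arg>${drSourceDir}</arg>
        <arg>-targetPath</arg>
        <arg>${drTargetClusterFS}${drTargetDir}</arg>
        <!-- Added per step 2: keep the source block size on the target so the
             post-copy checksum comparison no longer fails. -->
        <arg>-preserveBlockSize</arg>
        <arg>true</arg>
    </java>
    <ok to="end"/>
    <error to="fail"/>
</action>

If preserving the block size is not desirable, the -skipChecksum flag mentioned above could be added in the same position instead, at the cost of skipping the checksum verification entirely.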
Created 03-27-2017 09:55 AM
Thanks @prsingh, I was able to run the job after the change.