Created 08-24-2016 03:37 PM
Attachment: distcp-error.txt
This command is failing and I am unable to find the reason from the log file. Please help identify the issue; the log file is attached.
hadoop distcp hdfs:///user/sami/ hdfs:///user/zhang
Created 08-24-2016 04:30 PM
The following line seems to indicate the issue:
Caused by: java.io.IOException: Check-sum mismatch between hdfs://hadoop1.tolls.dot.state.fl.us:8020/user/sami/error1.log and hdfs://hadoop1.tolls.dot.state.fl.us:8020/user/zhang/.distcp.tmp.attempt_1472051594557_0001_m_000001_0. Source and target differ in block-size. Use -pb to preserve block-sizes during copy. Alternatively, skip checksum-checks altogether, using -skipCrc. (NOTE: By skipping checksums, one runs the risk of masking data-corruption during file-transfer.)
Is the block size set differently between the source and target clusters?
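If it helps, here is a minimal sketch of the retry the error message suggests, preserving the source block size with -pb. The function name is just something made up for illustration, and the paths are the ones from the original post; I have not run this against your cluster.

```shell
# Re-run the copy with -pb so each target file keeps the block size
# of its source file, which lets the post-copy checksum comparison pass.
# The function name is hypothetical; only "hadoop distcp -pb" is the real command.
retry_distcp_preserving_blocksize() {
  src="$1"
  dst="$2"
  hadoop distcp -pb "$src" "$dst"
}

# Usage (paths from the original post):
# retry_distcp_preserving_blocksize hdfs:///user/sami/ hdfs:///user/zhang
```

Preserving the block size is generally the safer of the two options in the message, since skipping the checksum comparison can hide real corruption.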
Created 08-24-2016 05:23 PM
The source and target clusters? I am using the same node (hadoop1), so I guess the block size would be the same.
How can I check all this?
Created 08-24-2016 05:55 PM
So you have two clusters on the same node? Is it possible that the two clusters have different block size settings? Can you please verify the dfs.blocksize setting on both clusters?
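A rough sketch of how the block size could be checked, assuming the hdfs client is on the PATH of the node you run it from. The function names are made up for illustration. The first prints the configured default (dfs.blocksize) in bytes; the second prints the block size an existing file was actually written with, which can differ from the default if the writing client used another setting.

```shell
# Print the default block size (in bytes) that this client configuration
# would use for newly written files.
show_default_blocksize() {
  hdfs getconf -confKey dfs.blocksize
}

# Print the block size (in bytes) and the name of an existing file.
# %o is the block size, %n is the file name.
show_file_blocksize() {
  hdfs dfs -stat "%o %n" "$1"
}

# Usage (file name taken from the error message above):
# show_default_blocksize
# show_file_blocksize /user/sami/error1.log
```

If the value reported for the source file does not match the default, that would explain the mismatch even within a single cluster.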
Created 08-24-2016 06:09 PM
Created 08-25-2016 02:36 PM
Wrong syntax: "-cp doesn't exist".
Created 08-29-2016 01:35 AM
Created 08-25-2016 05:28 AM
If I'm not wrong, you are trying to copy data between different directories within the same cluster.
You can simply use the copy command.
hadoop fs -cp hdfs:///user/sami/ hdfs:///user/zhang
Created 08-25-2016 02:36 PM
I want to use distcp for learning purposes.
Created 08-25-2016 03:48 PM