hadoop distcp command failing
Labels: Apache Hadoop
Created ‎08-24-2016 03:37 PM
This command is failing and I am unable to find the reason from the log file. Please help identify the issue; the log file is attached (distcp-error.txt).
hadoop distcp hdfs:///user/sami/ hdfs:///user/zhang
Created ‎08-24-2016 04:30 PM
The following line seems to indicate the issue:
Caused by: java.io.IOException: Check-sum mismatch between hdfs://hadoop1.tolls.dot.state.fl.us:8020/user/sami/error1.log and hdfs://hadoop1.tolls.dot.state.fl.us:8020/user/zhang/.distcp.tmp.attempt_1472051594557_0001_m_000001_0. Source and target differ in block-size. Use -pb to preserve block-sizes during copy. Alternatively, skip checksum-checks altogether, using -skipCrc. (NOTE: By skipping checksums, one runs the risk of masking data-corruption during file-transfer.)
Is the block size set differently between the source and target clusters?
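The error message names two workarounds. As a rough sketch, reusing the paths from the question: `-pb` preserves the source block size on the target, and checksum checks can be skipped with distcp's `-skipcrccheck` option (the message abbreviates it as `-skipCrc`), which as far as I know is only accepted together with `-update`. The variable names and the `RUN` switch below are illustrative only; the script just prints the commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Paths taken from the question; adjust them for your cluster.
SRC="hdfs:///user/sami/"
DST="hdfs:///user/zhang"

# Option 1: preserve the source block size on the target (-pb),
# so the checksums of source and copy stay comparable.
PRESERVE_CMD="hadoop distcp -pb $SRC $DST"

# Option 2: skip the CRC comparison (assumed to require -update).
# This can mask data corruption during the transfer.
SKIP_CMD="hadoop distcp -update -skipcrccheck $SRC $DST"

# Dry run by default: only print the commands.
echo "$PRESERVE_CMD"
echo "$SKIP_CMD"

# Set RUN=1 to actually execute the safer first option.
if [ "${RUN:-0}" = "1" ]; then
  $PRESERVE_CMD
fi
```

Preserving the block size is usually the safer choice of the two, since the copy is still verified.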
Created ‎08-24-2016 05:23 PM
The source and target clusters? I am using the same node (hadoop1), so I would expect the block size to be the same.
How can I check this?
Created ‎08-24-2016 05:55 PM
So you have two clusters on the same node? Is it possible that the two clusters have different block size settings? Can you please verify the dfs.blocksize setting on both clusters?
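One possible way to check this from the command line, as a sketch: `hdfs getconf -confKey dfs.blocksize` prints the default block size the client configuration would use for new files, and `hdfs dfs -stat` with the `%o` format prints the block size an existing file was actually written with. The function name `show_block_sizes` and the `HDFS_BIN` variable are made up for this example; the file path is the one from the error message.

```shell
#!/bin/sh
# HDFS_BIN is a hypothetical override so the script can point at a
# specific hdfs binary; it defaults to "hdfs" on the PATH.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Print the configured default block size, then the block size (%o)
# and name (%n) of each file passed as an argument.
show_block_sizes() {
  "$HDFS_BIN" getconf -confKey dfs.blocksize
  "$HDFS_BIN" dfs -stat "%o %n" "$@"
}

# Only call into Hadoop when the client is actually installed.
if command -v "$HDFS_BIN" >/dev/null 2>&1; then
  show_block_sizes /user/sami/error1.log
else
  echo "hdfs client not found on PATH; skipping check"
fi
```

If the per-file value differs from the configured default, the file was written with a non-default block size, and a plain distcp copy would end up with a different block size on the target.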
Created ‎08-24-2016 06:09 PM
Created ‎08-25-2016 02:36 PM
Wrong syntax: "-cp doesn't exist".
Created ‎08-29-2016 01:35 AM
Created ‎08-25-2016 05:28 AM
If I'm not mistaken, you are trying to copy data between directories within the same cluster.
You can simply use the copy command:
hadoop fs -cp hdfs:///user/sami/ hdfs:///user/zhang
Created ‎08-25-2016 02:36 PM
I want to use distcp for learning purposes.
Created ‎08-25-2016 03:48 PM
