01-06-2023
01:28 PM
Hi,

I am using distcp to copy data from Hadoop HDFS to S3. Below is a shortened version of the command I use:

    hadoop distcp -pu -update -delete hdfs_path s3a://bucket

I recently ran into an issue with the following case. I have a file in HDFS, temp_file, containing the data 1234567890, with a size of 27 KB. The first time I run distcp, it pushes the file to the S3 bucket without any issue. I then update the same file, temp_file, with different content, abcdefghij, but the same size, 27 KB. When I run distcp again, instead of comparing the checksums of the source and target, it skips the file and does not copy the updated file from HDFS to S3.

Am I missing any options in the distcp command to make this scenario work?
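The scenario can be mimicked locally, without Hadoop or S3, to show why a same-size rewrite is easy to miss. This is only a rough sketch: it assumes (not confirmed here) that the skip decision ends up relying on file length alone when the source and target checksums cannot be compared. The file names v1 and v2 are hypothetical stand-ins for the two versions of temp_file.

```shell
#!/bin/sh
# Local illustration only: no Hadoop or S3 is involved.
tmpdir=$(mktemp -d)

# Two versions of a hypothetical temp_file: same length, different content.
printf '1234567890' > "$tmpdir/v1"
printf 'abcdefghij' > "$tmpdir/v2"

# Byte counts of both versions (tr strips padding some wc builds add).
size_v1=$(wc -c < "$tmpdir/v1" | tr -d ' ')
size_v2=$(wc -c < "$tmpdir/v2" | tr -d ' ')

# A length-only comparison sees no difference between the versions.
if [ "$size_v1" -eq "$size_v2" ]; then
    size_changed=no
else
    size_changed=yes
fi

# A byte-for-byte comparison (standing in for a checksum) does see one.
if cmp -s "$tmpdir/v1" "$tmpdir/v2"; then
    content_changed=no
else
    content_changed=yes
fi

echo "size changed: $size_changed, content changed: $content_changed"
rm -rf "$tmpdir"
```

If that assumption holds for HDFS to S3A copies, any update that keeps the file length identical would look unchanged to the copy job, which matches the behaviour described above.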
12-15-2021
06:06 AM
I am also facing this issue. Were you able to resolve this?