Created 03-02-2025 11:51 PM
Hi,
I currently have 1 TB of data in HDFS that I am trying to migrate to S3 using the command below. Whenever I run it, the job runs very fast for about 3 hours and then slows down. I started it a week ago and it is still running, very slowly. Is this expected behavior?
nohup hadoop distcp -Dfs.s3a.access.key="$AWS_ACCESS_KEY_ID" -Dfs.s3a.secret.key="$AWS_SECRET_ACCESS_KEY" -Dfs.s3a.fast.upload=true -Dfs.s3a.fast.buffer.size=1048576 -Dfs.s3a.multipart.size=10485760 -Dfs.s3a.multipart.threshold=10485760 -Dmapreduce.map.memory.mb=8192 -Dmapreduce.map.java.opts=-Xmx7360m -m=300 -bandwidth 400 -update hdfs:<....> s3a://<.......>
Created 03-04-2025 02:36 AM
@Zubair123, Welcome to our community! To help you get the best possible answer, I have tagged in our HDFS experts @willx @ChethanYM who may be able to assist you further.
Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
Regards,
Vidya Sargur
Created 03-04-2025 10:10 AM
You may want to collect the YARN application log to understand what happened after 3 hours; for example, it may be a YARN resource issue or stuck containers.
1. Enable console debug logging, re-run distcp, and save the output:
export HADOOP_ROOT_LOGGER=DEBUG,console
nohup hadoop distcp -Dfs.s3a.access.key="$AWS_ACCESS_KEY_ID" -Dfs.s3a.secret.key="$AWS_SECRET_ACCESS_KEY" -Dfs.s3a.fast.upload=true -Dfs.s3a.fast.buffer.size=1048576 -Dfs.s3a.multipart.size=10485760 -Dfs.s3a.multipart.threshold=10485760 -Dmapreduce.map.memory.mb=8192 -Dmapreduce.map.java.opts=-Xmx7360m -m=300 -bandwidth 400 -update [hdfs path] [s3a path] > distcp_console.out 2>&1 &
2. Collect the YARN application logs:
yarn logs -applicationId [applicationID] > /tmp/distcp_application.out
3. If there are stuck YARN containers, collect a jstack of the container PID; refer to the post below.
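A rough sketch of how the steps above might be scripted, in case it helps. It is not taken from the linked post. The function names (`extract_app_id`, `filter_container_pids`, `dump_threads`) and the `/tmp` output paths are made up for illustration. It assumes the console output contains a line mentioning the `application_...` ID, that `jstack` from the JDK is on the PATH of the NodeManager host, and that it is run as the user that owns the container process.

```shell
#!/bin/sh
# Hypothetical helpers; names and output paths are illustrative only.

# Print the first YARN application ID found in a saved distcp console log.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' "$1" | head -n 1
}

# Read `ps -eo pid=,args=` output on stdin and print the PIDs of processes
# whose executable ends in "java" and whose command line mentions the
# given container ID (this skips the bash wrapper YARN uses as a launcher).
filter_container_pids() {
  awk -v id="$1" '$2 ~ /java$/ && index($0, id) { print $1 }'
}

# Take several thread dumps of one JVM, a few seconds apart, so that
# threads stuck in the same stack frame across dumps stand out.
dump_threads() {
  pid="$1"
  count="${2:-3}"
  interval="${3:-10}"
  i=1
  while [ "$i" -le "$count" ]; do
    jstack "$pid" > "/tmp/jstack_${pid}_${i}.out" 2>&1
    [ "$i" -lt "$count" ] && sleep "$interval"
    i=$((i + 1))
  done
}

# Example usage (placeholders in square brackets are not real values):
#   app_id=$(extract_app_id distcp_console.out)
#   yarn logs -applicationId "$app_id" > /tmp/distcp_application.out
#   # On the NodeManager host that runs the stuck container:
#   ps -eo pid=,args= | filter_container_pids [containerID] |
#     while read -r pid; do dump_threads "$pid"; done
```

Comparing the dumps side by side usually shows whether the map tasks are blocked on S3 uploads, waiting on HDFS reads, or idle.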
Created 03-11-2025 10:23 AM
@willx I really appreciate the response, but it looks like I don't have access to the article.
Could you please share the solution? I would really appreciate the help.
Thanks,
Zubair.
Created 03-10-2025 11:04 PM
@Zubair123, Did the response help resolve your query? If it did, kindly mark the relevant reply as the solution, as it will aid others in locating the answer more easily in the future.
Regards,
Vidya Sargur
Created 03-11-2025 10:23 AM
@VidyaSargur I don't have access to the article, so I am still waiting for the solution to be shared.
Created 03-12-2025 10:57 PM
@Zubair123, This article is available exclusively for our customers. If you're a customer, please contact our customer support team for more details. If you’re not, our sales team would happily assist you with any information you need.
Regards,
Vidya Sargur