Support Questions


Moving files from HDFS to S3

New Contributor

Hello,

I am trying to move data from my Hadoop cluster to an AWS S3 bucket using the AWS CLI, but even a file of 4 MB takes a long time and I am unable to understand the bottleneck here.

I am running the below command:

aws s3 cp <source_folder> s3://<path>/ --recursive
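
For context, the full sequence I am running looks roughly like the one below; it assumes the data is first pulled from HDFS onto the local disk of an edge node, and all paths and the bucket name are placeholders:

  # pull the folder out of HDFS onto the local disk of the edge node (placeholder paths)
  hdfs dfs -get /user/<user>/source_folder /tmp/source_folder

  # push the local copy to S3 with the AWS CLI (placeholder bucket and prefix)
  aws s3 cp /tmp/source_folder s3://<bucket>/<prefix>/ --recursive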

 

The file size is 4 MB and the upload speed is 50-60 KiB/s.

I talked to the AWS side as well; according to them, the issue may be due to networking at the client end, i.e., the Cloudera Hadoop cluster.

Can anyone help me understand how I can move my data from Hadoop HDFS to AWS S3 efficiently?

2 REPLIES

Master Collaborator

Hello @shadma-1 

 

We have a feature in CDP to achieve this, named Replication Manager; with its help you can migrate your HDFS/Hive/HBase data to S3 or Azure.

 

Please refer to the official links below for reference; if you have any difficulties, you can reach out to the support team as well.

 

1- Cloudera Replication Manager 

2- Introduction to Replication Manager 
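
If you would like to sanity-check the network path outside of Replication Manager, one common approach is a plain DistCp over the s3a connector, which copies from HDFS to the bucket in parallel across the cluster. The below is only a rough sketch; the HDFS path, bucket name, and credential properties are placeholders, and your cluster may already have s3a credentials configured centrally:

  # distributed copy from HDFS to S3 over the s3a connector (placeholder values)
  hadoop distcp \
    -Dfs.s3a.access.key=<access_key> \
    -Dfs.s3a.secret.key=<secret_key> \
    hdfs:///user/<user>/source_folder \
    s3a://<bucket>/<prefix>/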

 

Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Master Collaborator

Hello @shadma-1 

 

Just wanted to check if you have any further queries related to Replication Manager.

 

Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button 🙂