Support Questions


Moving files from HDFS to S3

New Contributor

Hello,

I am trying to move data from my Hadoop cluster to an AWS S3 bucket using the AWS CLI, but even a file of 4 MB takes a long time and I am unable to understand the bottleneck here.

I am running the below command:

aws s3 cp <source_folder> s3://<path>/ --recursive
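
For context, the full sequence I am running looks roughly like the one below; it assumes the data is first pulled from HDFS onto the local disk of an edge node, and all paths and the bucket name are placeholders:

  # pull the folder out of HDFS onto the local disk of the edge node (placeholder paths)
  hdfs dfs -get /user/<user>/source_folder /tmp/source_folder

  # push the local copy to S3 with the AWS CLI (placeholder bucket and prefix)
  aws s3 cp /tmp/source_folder s3://<bucket>/<prefix>/ --recursive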

 

The file size is 4 MB and the upload speed is 50-60 KiB/s.

I talked to the AWS side as well; according to them, the issue may be due to networking at the client end, i.e., the Cloudera Hadoop cluster.

Can anyone help me understand how I can move my data from Hadoop HDFS to AWS S3 efficiently?

2 REPLIES

Master Collaborator

Hello @shadma-1 

 

We have a feature in CDP to achieve this, named Replication Manager; with its help you can migrate your HDFS/Hive/HBase data to S3 or Azure.

 

Please refer to the official links below for reference; if you have any difficulties, you can reach out to the support team as well.

 

1- Cloudera Replication Manager 

2- Introduction to Replication Manager 
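
If you would like to sanity-check the network path outside of Replication Manager, one common approach is a plain DistCp over the s3a connector, which copies from HDFS to the bucket in parallel across the cluster. The below is only a rough sketch; the HDFS path, bucket name, and credential properties are placeholders, and your cluster may already have s3a credentials configured centrally:

  # distributed copy from HDFS to S3 over the s3a connector (placeholder values)
  hadoop distcp \
    -Dfs.s3a.access.key=<access_key> \
    -Dfs.s3a.secret.key=<secret_key> \
    hdfs:///user/<user>/source_folder \
    s3a://<bucket>/<prefix>/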

 

Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Master Collaborator

Hello @shadma-1 

 

Just wanted to check if you have any further queries related to Replication Manager.

 

Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button 🙂