Support Questions

shadma-1 · 01-31-2022

Hello,

I am trying to move data from a Hadoop cluster to an AWS S3 bucket using the AWS CLI, but even a 4 MB file takes a long time, and I cannot identify the bottleneck.

I am running the following command:

aws s3 cp <source_folder> s3://<path>/ --recursive

The file is 4 MB, and the upload speed is 50-60 KiB/s.
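To put those numbers in perspective, the small sketch below estimates how long a single file takes at the reported rate. It assumes "4MB" means 4 MiB (4096 KiB); the helper name `transfer_seconds` is made up for illustration.

```shell
# Rough transfer time for one file at the reported upload rates.
# Assumption: "4MB" is 4 MiB (4096 KiB); rates are the 50-60 KiB/s from the post.
transfer_seconds() {
  # $1 = file size in MiB, $2 = upload rate in KiB/s
  awk -v size_mib="$1" -v rate_kib="$2" \
    'BEGIN { printf "%.0f\n", size_mib * 1024 / rate_kib }'
}

transfer_seconds 4 50   # slower end of the reported range
transfer_seconds 4 60   # faster end of the reported range
```

Under that assumption, a single 4 MB file needs roughly 68 to 82 seconds, which is far slower than a typical data-center uplink would suggest.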

I also talked to the AWS side. According to them, the issue may be due to networking at the client end, i.e. the Cloudera Hadoop cluster.

Can anyone help me understand how I can move my data from Hadoop HDFS to AWS S3 efficiently?

shehbazk · 03-17-2022

Hello @shadma-1

CDP provides a service for this called Replication Manager. With it, you can migrate your HDFS, Hive, and HBase data to S3 or Azure.

Please refer to the official links below. If you run into any difficulties, you can also reach out to the support team.

1- Cloudera Replication Manager

2- Introduction to Replication Manager
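For a one-off copy outside Replication Manager, a common alternative is Hadoop DistCp writing through the s3a connector, which reads directly from HDFS and copies files in parallel map tasks. The sketch below is not taken from this thread: the paths, the bucket name, the map count, and the wrapper name `distcp_to_s3` are placeholders, and it assumes s3a credentials are already configured on the cluster. By default it only prints the command.

```shell
# Sketch: copy an HDFS directory to S3 with DistCp over the s3a connector.
# Paths, bucket name, and map count are placeholders, not values from the thread.
# Assumption: s3a credentials are already configured on the cluster.
distcp_to_s3() {
  src="$1"; dst="$2"; maps="${3:-20}"
  # -m caps the number of parallel map tasks used for the copy
  set -- hadoop distcp -m "$maps" "$src" "$dst"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"   # dry run: print the command instead of running it
  else
    "$@"        # run the copy on a node with the Hadoop client installed
  fi
}

distcp_to_s3 hdfs:///user/example/source_folder s3a://example-bucket/path/
```

Setting `DRY_RUN=0` before calling the function runs the copy for real. Whether this is faster than `aws s3 cp` in a given environment depends on the same network path, so it is worth testing on a small directory first.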

Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

shehbazk · 03-24-2022

Hello @shadma-1

Just wanted to check whether you have any further questions about Replication Manager.

