Moving files from HDFS to S3
Labels: Cloudera Data Engineering (CDE)
Created 01-31-2022 10:56 AM
Hello,
I am trying to move data from a Hadoop cluster to an AWS S3 bucket using the AWS CLI, but even a 4 MB file takes a long time, and I cannot identify the bottleneck.
I am running the following command: aws s3 cp <source_folder> s3://<path>/ --recursive
The file size is 4 MB and the upload speed is 50-60 KiB/s.
I also talked to AWS; according to them, the issue may be due to networking on the client end, i.e. the Cloudera Hadoop cluster.
Can anyone help me understand how I can move my data from Hadoop HDFS to AWS S3 efficiently?
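To put the reported numbers in perspective, here is a rough sketch of the transfer time they imply. The helper name `transfer_seconds` is hypothetical, and it assumes the 4 MB file is 4 MiB and that the rate stays constant for the whole upload.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate transfer time in whole seconds.
# Usage: transfer_seconds <size_in_MiB> <rate_in_KiB_per_s>
transfer_seconds() {
  local size_mib=$1 rate_kib=$2
  # 1 MiB = 1024 KiB; shell integer division rounds down
  echo $(( size_mib * 1024 / rate_kib ))
}

transfer_seconds 4 50   # slow end of the reported range
transfer_seconds 4 60   # fast end of the reported range
```

At these rates a single 4 MiB file needs more than a minute, so the time taken is consistent with the measured throughput itself rather than with any per-file overhead in the command.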
Created 03-17-2022 12:13 PM
Hello @shadma-1
CDP provides a service for this called Replication Manager. With it, you can migrate your HDFS, Hive, and HBase data to S3 or Azure.
Please refer to the official links below. If you run into any difficulties, you can also reach out to the support team.
1- Cloudera Replication Manager
2- Introduction to Replication Manager
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Created 03-24-2022 04:59 AM
Hello @shadma-1
Just wanted to check whether you have any further questions about Replication Manager.
