Unable to copy data from HDFS to S3


Explorer

I am trying to copy data from HDFS to S3 using the distcp command. The command works for individual files.

 

So, hadoop distcp /user/username/file.txt s3a://xxxxx works fine

 

But when I try to copy the entire directory structure, it fails to create the directory and gives the error: Error: java.io.IOException: mkdir failed for s3a://bucket****/ Error Code: 403 Forbidden; Request ID: 447400E9C5995ED9), S3 Extended Request ID: T0hsw+XaBMrkMUhDcJBKGIRtSF58dKedZdCH2qC32v9uVkwR94SGiI7Xxe8lqaFaDyjwS3oCpkg=

 

Even a simple mkdir command against S3 fails with the same 403 Forbidden error.

So I am not sure what the root cause is.

 

I am able to copy the files but not able to create directories.


Re: Unable to copy data from HDFS to S3

Cloudera Employee

You may not have all the permissions you need on the bucket.  Make sure you have "s3:ListBucket" on resource "arn:aws:s3:::<bucket-name>", as well as "s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:PutObjectAcl" on resource "arn:aws:s3:::<bucket-name>/*"

 

You probably also want to add the permissions for creating and deleting multipart uploads. I'd add privileges for the operations mentioned in the multipart upload docs as well.
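For illustration, the permissions listed above could be assembled into an IAM policy document roughly like this. This is only a sketch: the function name `build_s3a_policy` and the bucket name `example-bucket` are made up for the example, and the three multipart actions (`s3:ListBucketMultipartUploads`, `s3:AbortMultipartUpload`, `s3:ListMultipartUploadParts`) are my assumption about which operations from the multipart upload docs are meant. Check the result against your own security requirements before attaching it.

```python
import json


def build_s3a_policy(bucket_name):
    """Build an IAM policy dict granting the S3 permissions discussed above.

    Bucket-level actions go on the bucket ARN; object-level actions go on
    the "<bucket ARN>/*" resource.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permissions.
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    # Assumption: multipart action chosen for this sketch.
                    "s3:ListBucketMultipartUploads",
                ],
                "Resource": bucket_arn,
            },
            {
                # Object-level permissions.
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:PutObjectAcl",
                    # Assumption: multipart actions chosen for this sketch.
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts",
                ],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder; substitute your own bucket name.
    print(json.dumps(build_s3a_policy("example-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM policy editor for the user or role whose keys the cluster uses.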

Re: Unable to copy data from HDFS to S3

New Contributor

How did you copy the HDFS files to S3? Using hadoop distcp?

Can you please tell me what configuration you did, other than passing the access and secret keys in core-site.xml?

Also, if you know, can you please tell me how to sync directories from HDFS to S3, like below:

hdfs://home/test/abc.txt      ----> s3://bucket/test/abc.txt

 

Re: Unable to copy data from HDFS to S3

Community Manager

Hi @VenkateshB,

 

You may want to take a look at the doc here, which provides some distcp examples:

https://www.cloudera.com/documentation/enterprise/6/latest/topics/cdh_admin_distcp_data_cluster_migr...
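As a rough illustration of the directory-level copy asked about in this thread, a small helper could build the distcp command line like this. It is a sketch under assumptions: the helper name `build_distcp_command` and the sample paths are hypothetical, the `s3a://` scheme follows the original post, and `-update` is the standard distcp option that copies the contents of the source directory into the target and skips files that already match. It only builds the command; running it still requires the credentials and bucket permissions discussed above.

```python
def build_distcp_command(hdfs_dir, bucket, prefix=""):
    """Return the argv list for a directory-level distcp from HDFS to S3.

    hdfs_dir: source directory on HDFS, e.g. "/home/test"
    bucket:   target S3 bucket name
    prefix:   optional key prefix (directory) inside the bucket
    """
    # Normalise trailing slashes; keep "/" itself if the root was given.
    source = hdfs_dir.rstrip("/") or "/"
    destination = f"s3a://{bucket}/{prefix.strip('/')}".rstrip("/")
    # With -update, the contents of the source directory are copied into
    # the destination directory, so /home/test/abc.txt lands at
    # s3a://<bucket>/<prefix>/abc.txt.
    return ["hadoop", "distcp", "-update", source, destination]


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    print(" ".join(build_distcp_command("/home/test", "bucket", "test")))
```

The returned list can be passed to `subprocess.run` on a node where the `hadoop` client is installed.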

 

Thanks and hope it helps,

Li

Li Wang, Technical Resolution Manager

