HDFS Backup to AWS S3 without Keys error

Expert Contributor

Dear All,

We have been using hadoop distcp from a crontab script to back up HDFS data to AWS S3, passing the AWS keys on the distcp command line. The backup also works without the AWS keys, but it is not reliable: sometimes it fails with a timeout error.

Is it mandatory to pass the AWS keys to the hadoop distcp command? If not, why do I get timeout/socket errors when I run it without the keys? I tested it manually a few times with the same result.

Command:

With Keys

hadoop distcp -Dfs.s3a.server-side-encryption-algorithm=AES256 -Dfs.s3a.access.key=${AWS_ACCESS_KEY_ID} -Dfs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY} -update hdfs://< HDFS dir>/ s3a://${BUCKET_NAME}/

Without Keys

hadoop distcp -Dfs.s3a.server-side-encryption-algorithm=AES256 -update hdfs://< HDFS dir>/ s3a://${BUCKET_NAME}/

Below is the error we get when running without the AWS keys.

"dfs.sh_20160630_010001:com.amazonaws.AmazonClientException: Unable to upload part: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 3C1FD2E8F503F052, AWS Error Code: RequestTimeout, AWS Error Message: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed."
1 ACCEPTED SOLUTION

Re: HDFS Backup to AWS S3 without Keys error

Expert Contributor

Hi All,

On the AWS side, we needed to grant the appropriate permissions to the role used for the role-based S3 backup. It is working now.
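The thread does not record which permissions were missing, so the following is only a rough sketch of what a role policy for this kind of backup might look like. The helper name s3_backup_policy is hypothetical, and the list of S3 actions is an assumption about what a distcp upload with -update typically needs (listing, reading, writing, deleting, and multipart upload handling); adjust it to your own setup.

```shell
# Hypothetical helper: print a candidate IAM policy document for one bucket.
# The action list is an assumption, not the policy used in this thread.
s3_backup_policy() {
  bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}
```

For example, s3_backup_policy "${BUCKET_NAME}" > policy.json writes the document to a file, which could then be attached to the instance role through the IAM console or the AWS CLI.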

Thank you for your valuable comments.

3 REPLIES

Re: HDFS Backup to AWS S3 without Keys error

@Muthukumar S: You need to either pass the AWS keys on the hadoop command line or add them permanently in core-site.xml.

Are you able to run hadoop fs -ls s3a://${BUCKET_NAME}/ (feel free to add the keys as needed)? This is to isolate whether the issue is authentication or connectivity.
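A minimal sketch of that check, assuming the same BUCKET_NAME, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables as in the question. The function name check_s3a and the DRY_RUN switch are hypothetical additions for illustration. Running it once with the keys set and once with them unset should show whether the failure follows the credentials or happens either way.

```shell
# Hypothetical helper: list the bucket over s3a, with or without explicit keys.
check_s3a() {
  bucket="$1"
  # Build the command as positional parameters so arguments stay intact.
  set -- hadoop fs
  # Pass the keys only when both are set; otherwise S3A has to find
  # credentials elsewhere (for example, the role attached to the instance).
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    set -- "$@" "-Dfs.s3a.access.key=${AWS_ACCESS_KEY_ID}" \
                "-Dfs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY}"
  fi
  set -- "$@" -ls "s3a://${bucket}/"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it (note: this shows the keys).
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Call it as check_s3a "${BUCKET_NAME}". If the listing succeeds with keys but fails without them, the problem is most likely on the credentials or role side and not in the network path.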

Re: HDFS Backup to AWS S3 without Keys error

Expert Contributor

@Sandeep Nemuri

My requirement is as follows: I want to omit the keys and use role-based authentication. The AWS instance now has the role assigned, but hadoop distcp does not work when I run the command without keys.

<property>
  <name>fs.s3a.access.key</name>
  <description>AWS access key ID. Omit for Role-based authentication.</description>
</property>

<property>
  <name>fs.s3a.secret.key</name>
  <description>AWS secret key. Omit for Role-based authentication.</description>
</property>