Created 09-27-2016 06:42 AM
Dear All,
We have been backing up HDFS data to AWS S3 with hadoop distcp, run from a script in crontab, and we pass the AWS keys on the distcp command line. The backup also works without the AWS keys, but it sometimes fails with a timeout error, so it is not reliable.
Is it mandatory to pass the AWS keys to the hadoop distcp command? If not, why do I get timeout/socket errors when I run it without the keys? I tested manually a few times and got the same result.
Command:
With Keys
hadoop distcp -Dfs.s3a.server-side-encryption-algorithm=AES256 -Dfs.s3a.access.key=${AWS_ACCESS_KEY_ID} -Dfs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY} -update hdfs://< HDFS dir>/ s3a://${BUCKET_NAME}/
Without Keys
hadoop distcp -Dfs.s3a.server-side-encryption-algorithm=AES256 -update hdfs://< HDFS dir>/ s3a://${BUCKET_NAME}/
Below is the error we get when running without the AWS keys.
"dfs.sh_20160630_010001:com.amazonaws.AmazonClientException: Unable to upload part: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 3C1FD2E8F503F052, AWS Error Code: RequestTimeout, AWS Error Message: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed."
Created 10-13-2016 04:18 AM
Hi All,
On the AWS side, we needed to grant the appropriate permissions for role-based AWS S3 backup. It is working now.
Thank you for your valuable comments.
Created 09-27-2016 09:26 AM
@Muthukumar S: You need to either pass the AWS keys on the hadoop command line or add them permanently in core-site.xml.
Are you able to run hadoop fs -ls s3a://${BUCKET_NAME}/ (feel free to add the keys accordingly)? This is to isolate whether the issue is authentication or connectivity.
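As a rough aid for that isolation step, the sketch below sorts an error line from the distcp or fs -ls output into a coarse bucket. The function name classify_s3_error is hypothetical, and the matched strings are assumptions: RequestTimeout is taken from the error quoted in this thread, while AccessDenied, InvalidAccessKeyId, SignatureDoesNotMatch and status 403 are standard S3 error indicators. Treat the result as a first hint only, since a timeout during upload does not by itself rule out a permissions problem.

```shell
# Hypothetical helper: classify one error line from a hadoop distcp / fs log.
# Prints "auth", "timeout", or "other". The patterns are assumptions, not an
# exhaustive list of what the S3A connector can report.
classify_s3_error() {
  case "$1" in
    *AccessDenied*|*InvalidAccessKeyId*|*SignatureDoesNotMatch*|*"Status Code: 403"*)
      echo "auth" ;;
    *RequestTimeout*|*SocketTimeoutException*)
      echo "timeout" ;;
    *)
      echo "other" ;;
  esac
}
```

One way to use it would be to capture the output of hadoop fs -ls s3a://${BUCKET_NAME}/ with 2>&1 and pass the failing line to classify_s3_error.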
Created 09-28-2016 03:38 AM
My requirement is below. I want to omit the keys and use role-based authentication. The AWS instance now has the role assigned, but hadoop distcp does not work when I run the command without keys.
<property>
  <name>fs.s3a.access.key</name>
  <description>AWS access key ID. Omit for Role-based authentication.</description>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <description>AWS secret key. Omit for Role-based authentication.</description>
</property>
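To make the two variants of the command easier to compare, here is a minimal sketch of a wrapper that adds the key options only when both key variables are set, and otherwise leaves them out so that role-based authentication is used. The function name build_distcp_cmd and its two arguments are hypothetical; the options themselves are the ones from the commands posted above. It only prints the arguments, one per line, as a dry run.

```shell
# Hypothetical helper: print the distcp argument list, one argument per line.
# $1 = HDFS source URI, $2 = S3 bucket name.
# Note: when keys are set, the secret is printed too, so do not log this output.
build_distcp_cmd() {
  src=$1
  bucket=$2
  set -- hadoop distcp -Dfs.s3a.server-side-encryption-algorithm=AES256
  # Add the key options only if both variables are non-empty;
  # otherwise omit them and rely on the role attached to the instance.
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    set -- "$@" "-Dfs.s3a.access.key=${AWS_ACCESS_KEY_ID}" \
                "-Dfs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY}"
  fi
  set -- "$@" -update "$src" "s3a://${bucket}/"
  printf '%s\n' "$@"
}
```

With both key variables unset, the printed list has no fs.s3a.access.key or fs.s3a.secret.key entries, which matches the "Without Keys" command.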