01-25-2017 08:39 AM
I'm trying to access an S3 bucket using the HDFS utilities, like below:
hdfs dfs -ls s3a://[BUCKET_NAME]/
but I'm getting the error:
-ls: Fatal internal error com.cloudera.com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
On the gateway node where I'm running the command, I don't have an AWS instance profile attached, but I do have one attached on all of the datanodes and namenodes. Running this command from one of the datanodes or namenodes works successfully. Is there a way I can run this command using only the instance profiles (no stored access keys or credentials) that exist on the datanodes and namenodes? The reason for this is that I don't want to allow direct S3 access from the gateway node.
02-09-2017 06:59 AM
You can put the S3 credentials in the S3 URI, or you can pass them as parameters on the command line, which is what I prefer, e.g.:
hadoop fs -Dfs.s3a.access.key="" -Dfs.s3a.secret.key="" -ls s3a://bucket-name/
It's also worth knowing that if you run the command as given above, the -D options will override any matching settings defined in the cluster configuration, such as core-site.xml.
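If you want to avoid putting the secret key on the command line (where it can show up in shell history and process listings), one alternative sketch is Hadoop's credential provider mechanism, which stores the keys in an encrypted JCEKS keystore. The keystore path below (jceks://hdfs/user/myuser/s3.jceks) is just an example location, not something specific to your cluster:

```
# Store the S3A keys in a JCEKS keystore on HDFS (you will be prompted for the values)
hadoop credential create fs.s3a.access.key -provider jceks://hdfs/user/myuser/s3.jceks
hadoop credential create fs.s3a.secret.key -provider jceks://hdfs/user/myuser/s3.jceks

# Point the command at the keystore instead of passing keys in plain text
hadoop fs -Dhadoop.security.credential.provider.path=jceks://hdfs/user/myuser/s3.jceks -ls s3a://bucket-name/
```

That said, this still means the gateway node can reach S3 once it has the keystore path, so it doesn't by itself give you the "instance profiles only on datanodes/namenodes" restriction the original question asked about.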