
Unable to access S3 bucket using S3A, HDFS CLI and Instance Profile

Contributor

I'm trying to access an S3 bucket using the HDFS utilities, like below:

 hdfs dfs -ls s3a://[BUCKET_NAME]/

but I'm getting the following error:

-ls: Fatal internal error
com.cloudera.com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain

 

On the gateway node where I'm running the command, there is no AWS instance profile attached, but all datanodes and namenodes do have one. Running this command from one of the datanodes or namenodes works successfully. Is there a way to run this command using only instance profiles (no stored access keys or credentials) that exist only on the datanodes and namenodes? The reason for this setup is that I don't want to allow direct S3 access from the gateway node.
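
One quick way to confirm which nodes actually have an instance profile attached is to query the EC2 instance metadata service from each node (plain AWS metadata, nothing Hadoop-specific):

 curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/

On the datanodes and namenodes this prints the attached role name; on the gateway node it returns nothing (404).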

 

2 REPLIES

Champion
The command will use the instance profile of the node it is launched from.

So if you want access from a node without attaching an instance profile to it, you need to specify the keys in the S3 URI.
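
For illustration, the inline-credential URI form looks roughly like this (placeholder key names, not real values):

 hdfs dfs -ls s3a://ACCESS_KEY:SECRET_KEY@[BUCKET_NAME]/

Be aware that embedding secrets in the URI leaks them into logs and shell history, and newer Hadoop releases deprecate or remove this form, so treat it as a last resort.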

Contributor

You can put the S3 credentials in the S3 URI, or you can just pass them as parameters on the command line, which is what I prefer, e.g.:

 

hadoop fs -Dfs.s3a.access.key="" -Dfs.s3a.secret.key="" -ls s3a://bucket-name/

It's also worth knowing that if you run the command as given above, the -D options will override any other settings defined in the cluster config, such as core-site.xml.
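
For completeness, the cluster-wide equivalents of those -D options are the following properties in core-site.xml (placeholder values shown); command-line -D flags take precedence over these:

 <property>
   <name>fs.s3a.access.key</name>
   <value>ACCESS_KEY</value>
 </property>
 <property>
   <name>fs.s3a.secret.key</name>
   <value>SECRET_KEY</value>
 </property>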