
Unable to access S3 bucket using S3A, HDFS CLI and Instance Profile

Explorer

I'm trying to access an S3 bucket using the HDFS CLI, like below:

 hdfs dfs -ls s3a://[BUCKET_NAME]/

but I'm getting the error:

-ls: Fatal internal error
com.cloudera.com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain

 

On the gateway node where I'm running the command, I don't have an AWS instance profile attached, but I do have one attached to all datanodes and namenodes. Running this command from one of the datanodes or namenodes works successfully. Is there a way I can run this command using only instance profiles (no stored access keys or credentials) on the datanodes and namenodes? The reason I'm asking is that I don't want to allow direct S3 access from the gateway node.
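
For what it's worth, S3A resolves credentials through a provider chain, which is what the "any provider in the chain" part of the error refers to. If your Hadoop version supports the fs.s3a.aws.credentials.provider property, you can pin the lookup to the instance-profile provider to confirm that's the path being taken. A sketch (the class name below is the stock AWS SDK one; it may be shaded differently in CDH, as the com.cloudera.com.amazonaws package in the error suggests):

 hdfs dfs -Dfs.s3a.aws.credentials.provider=com.amazonaws.auth.InstanceProfileCredentialsProvider -ls s3a://[BUCKET_NAME]/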

 

2 REPLIES

Re: Unable to access S3 bucket using S3A, HDFS CLI and Instance Profile

Champion
The command will use the instance profile of the node it is launched from.

So if you want access from a node that has no instance profile attached, you need to specify the keys in the S3 URI.
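
A sketch of that URI form (ACCESS_KEY and SECRET_KEY are placeholders; note that this inline-credentials syntax is deprecated in newer S3A releases, since the secret can leak into logs and shell history, and keys containing "/" or "+" tend to break it):

 hdfs dfs -ls s3a://ACCESS_KEY:SECRET_KEY@[BUCKET_NAME]/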

Re: Unable to access S3 bucket using S3A, HDFS CLI and Instance Profile

Cloudera Employee

You can put the S3 credentials in the S3 URI, or you can pass them as parameters on the command line, which is what I prefer, e.g.:

 

hadoop fs -Dfs.s3a.access.key="" -Dfs.s3a.secret.key="" -ls s3a://bucket-name/

It's also worth knowing that if you run the command as given above, the -D options will override any settings defined in the cluster config, such as core-site.xml.
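
If you'd rather not put the keys on the command line at all (they end up in shell history and ps output), a sketch using the standard Hadoop credential provider instead; the jceks path, ACCESS_KEY, and SECRET_KEY below are placeholders:

 hadoop credential create fs.s3a.access.key -value ACCESS_KEY -provider jceks://hdfs/user/s3a.jceks
 hadoop credential create fs.s3a.secret.key -value SECRET_KEY -provider jceks://hdfs/user/s3a.jceks
 hadoop fs -Dhadoop.security.credential.provider.path=jceks://hdfs/user/s3a.jceks -ls s3a://bucket-name/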