
fs.s3a.endpoint ignored in hdfs-site.xml


New Contributor

I'm trying to connect a Cloudera-installed Hadoop cluster to Amazon's S3 service in AWS GovCloud.

 

I'm using Cloudera 5.4. I set the GovCloud endpoint in hdfs-site.xml to:

<property>
  <name>fs.s3a.endpoint</name>
  <description>AWS S3 endpoint to connect to. An up-to-date list is
  provided in the AWS Documentation: regions and endpoints. Without this
  property, the standard region (s3.amazonaws.com) is assumed.
  </description>
  <value>s3-us-gov-west-1.amazonaws.com</value>
</property>

 

I also put the appropriate access key and secret key into hdfs-site.xml.
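For reference, the credential block I added looks roughly like this (assuming the standard S3A property names fs.s3a.access.key and fs.s3a.secret.key; the values here are placeholders, not real keys):

<property>
  <name>fs.s3a.access.key</name>
  <!-- placeholder: the GovCloud access key goes here -->
  <value>MY_GOVCLOUD_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <!-- placeholder: the GovCloud secret key goes here -->
  <value>MY_GOVCLOUD_SECRET_KEY</value>
</property>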

 

Then, when I run "hadoop fs -ls s3a://bucketname/", I get the error "The AWS Access Key Id you provided does not exist in our records."

 

By default, fs.s3a.endpoint points to the non-GovCloud endpoint of the S3 service. I have tested the non-GovCloud endpoint and it works: if I put in an access key, secret key, and bucket name that exist in my non-GovCloud account, the connection is fine and I can list files. So, basically, Hadoop is ignoring the s3-us-gov-west-1.amazonaws.com endpoint I specified in hdfs-site.xml and always going directly to the non-GovCloud endpoint. It is reading the hdfs-site.xml file, though, since it successfully reads the access key and secret key from there.
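One way to rule out the config file entirely (assuming hadoop fs accepts generic -D property overrides here, and using the same placeholder key names as above) is to force the endpoint on the command line for a single command:

# Force the GovCloud endpoint for one command; bucket name and keys are placeholders
hadoop fs \
  -D fs.s3a.endpoint=s3-us-gov-west-1.amazonaws.com \
  -D fs.s3a.access.key=MY_GOVCLOUD_ACCESS_KEY \
  -D fs.s3a.secret.key=MY_GOVCLOUD_SECRET_KEY \
  -ls s3a://bucketname/

If that works but the hdfs-site.xml version does not, the property itself is fine and the problem is where it is being loaded from.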

 

Any thoughts on how to fix this?

 

Thanks!

2 REPLIES

Re: fs.s3a.endpoint ignored in hdfs-site.xml

New Contributor

Hi there,

 

I'm having the same issue, connecting to an S3-compatible (non-AWS) endpoint.

 

I'm using stock Hadoop 2.6.3 with Java 1.8 (u65). I had to add the AWS jar to the HADOOP_CLASSPATH. I have confirmed that the s3a:// connector works fine against AWS. However, when I set the fs.s3a.endpoint parameter to a different endpoint, it still queries AWS.
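For what it's worth, the classpath change I made looks roughly like this (jar names, versions, and paths are from my unpacked tarball and may differ on another install):

# Add the S3A connector and the AWS SDK it depends on to Hadoop's classpath
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:hadoop-2.6.3/share/hadoop/tools/lib/hadoop-aws-2.6.3.jar"
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:hadoop-2.6.3/share/hadoop/tools/lib/aws-java-sdk-1.7.4.jar"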

 

$ hadoop-2.6.3/bin/hadoop fs -ls s3a://foo:bar@some-bucket/

 

This works for an AWS bucket, but fails for a bucket on my S3-compatible storage (auth error).

 

Here's my block in core-site.xml:

 

<property>
  <name>fs.s3a.endpoint</name>
  <description>AWS S3 endpoint to connect to. An up-to-date list is
  provided in the AWS Documentation: regions and endpoints. Without this
  property, the standard region (s3.amazonaws.com) is assumed.
  </description>
  <value>my.local.endpoint.tld</value>
</property>

 

Halp!?

 


Re: fs.s3a.endpoint ignored in hdfs-site.xml

Master Guru
Please use core-site.xml for S3A properties. The hdfs-site.xml file may not be loaded unless the context is an hdfs:// or related URI, but core-site.xml is always loaded.
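A minimal core-site.xml block along those lines (endpoint taken from the original post; the key values are placeholders, and the property names are assumed to be the standard S3A ones) might look like:

<property>
  <name>fs.s3a.endpoint</name>
  <value>s3-us-gov-west-1.amazonaws.com</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <!-- placeholder access key -->
  <value>MY_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <!-- placeholder secret key -->
  <value>MY_SECRET_KEY</value>
</property>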