06-23-2021
10:09 AM
@Arjun_bedi I'm afraid you've hit a problem which we've only just started encountering: HADOOP-17771. S3A filesystem creation fails with "Unable to find a region via the region provider chain."

This failure surfaces when _all_ of the following conditions are met:

- The deployment is outside EC2.
- The configuration option `fs.s3a.endpoint` is unset.
- The file `~/.aws/config` does not exist, or has no region set in it.
- The JVM system property `aws.region` does not declare a region.
- The environment variable `AWS_REGION` does not declare a region.

You can make this go away by setting the S3 endpoint to s3.amazonaws.com.

In core-site.xml:

<property>
  <name>fs.s3a.endpoint</name>
  <value>s3.amazonaws.com</value>
</property>

In your Scala code:

sc.hadoopConfiguration.set("fs.s3a.endpoint", "s3.amazonaws.com")

Even better, if you know the actual region your data lives in, set `fs.s3a.endpoint` to that region's endpoint. This saves an HTTP request to the central endpoint whenever an S3A filesystem instance is created.

We are working on the fix for this and will be backporting it where needed. I was not expecting CDH 6.3.x to be in need of it, but clearly it is.
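
If you want to see which of the region sources listed above are missing on the machine where the job runs, something like the following rough sketch may help. The object and method names (`RegionCheck`, `configHasRegion`, `regionSources`) are made up for illustration and are not part of Hadoop or the AWS SDK. It only looks at the environment variable, the JVM property, and the config file; it does not detect whether you are on EC2 or whether `fs.s3a.endpoint` is set, and its config-file check is crude (it accepts a `region = ...` line in any profile).

```scala
import java.io.File
import scala.io.Source

// Hypothetical helper, not part of Hadoop or the AWS SDK.
object RegionCheck {

  // True if the file exists and has at least one line of the form "region = <value>".
  // Crude: it does not check which profile the line belongs to.
  def configHasRegion(path: String): Boolean = {
    val f = new File(path)
    if (!f.isFile) false
    else {
      val src = Source.fromFile(f)
      try src.getLines().exists(_.trim.matches("""region\s*=\s*\S+.*"""))
      finally src.close()
    }
  }

  // One entry per region source named in the post, with whether it is populated.
  def regionSources(): Seq[(String, Boolean)] = Seq(
    "environment variable AWS_REGION" -> sys.env.get("AWS_REGION").exists(_.nonEmpty),
    "JVM system property aws.region"  -> sys.props.get("aws.region").exists(_.nonEmpty),
    "region in ~/.aws/config"         -> configHasRegion(sys.props("user.home") + "/.aws/config")
  )

  def main(args: Array[String]): Unit =
    regionSources().foreach { case (name, present) =>
      println(s"$name: ${if (present) "set" else "missing"}")
    }
}
```

In spark-shell you can paste the object with `:paste` and then call `RegionCheck.main(Array())`. Keep in mind this reports on the JVM it runs in (the driver, in spark-shell), and the environment on the executors may differ. If all three come back as missing and you are outside EC2 with no endpoint configured, you are most likely in the situation described above.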