
S3 connectivity using impala shell

Contributor

Hello,

 

I tried the following operations for Impala on a Cloudera machine, using impala-shell and following this documentation:

https://www.cloudera.com/documentation/enterprise/5-7-x/topics/impala_s3.html

 

On this machine I verified S3 connectivity with the following command and was able to access the bucket:

hdfs dfs -Dfs.s3a.access.key=myaccesskey -Dfs.s3a.secret.key=mysecretkey -ls s3a://myclouderaraj/root

 

As root on the quickstart machine, I updated /etc/hadoop/conf/hdfs-site.xml with the access key and secret key.

I restarted all the services and also tried restarting cloudera-scm-server.

Then I ran the following from impala-shell:

[localhost:21000] > create database db_on_hdfs;
[localhost:21000] > use db_on_hdfs;
[localhost:21000] > create table mostly_on_hdfs (x int) partitioned by (year int);
[localhost:21000] > alter table mostly_on_hdfs add partition (year=2013);
[localhost:21000] > alter table mostly_on_hdfs add partition (year=2014);
[localhost:21000] > alter table mostly_on_hdfs add partition (year=2015) location 's3a://impala-demo/dir1/dir2/dir3/t1';

[quickstart.cloudera:21000] > alter table mostly_on_hdfs add partition (year=2015) LOCATION 's3a://myclouderaraj/root';
Query: alter table mostly_on_hdfs add partition (year=2015) LOCATION 's3a://myclouderaraj/root'
ERROR: AnalysisException: null
CAUSED BY: AmazonClientException: Unable to load AWS credentials from any provider in the chain

[quickstart.cloudera:21000] >

 

Can you please help?

 

Thanks & Regards,

Rajesh

1 ACCEPTED SOLUTION

Contributor

Yes, you need to add the access key and secret key to core-site.xml (rather than hdfs-site.xml) for Impala to pick up this configuration.
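
One way to confirm that both keys are actually present in the file you edited is a small shell check. This is only a sketch: has_property and check_s3a_keys are hypothetical helper names, and the path in the usage line simply reuses the /etc/hadoop/conf directory from the question with core-site.xml substituted; the configuration directory that Impala reads on a given cluster may differ.

```shell
# Hypothetical helper: succeed if the Hadoop XML file "$1" contains
# a <name> element for the property "$2".
has_property() {
  # -F treats the pattern as a fixed string, so the dots in the
  # property name are not interpreted as regex wildcards.
  grep -qF "<name>$2</name>" "$1"
}

# Hypothetical helper: report whether each S3A credential property
# is defined in the file given as "$1".
check_s3a_keys() {
  for key in fs.s3a.access.key fs.s3a.secret.key; do
    if has_property "$1" "$key"; then
      echo "$key: present"
    else
      echo "$key: MISSING"
    fi
  done
}
```

Usage, assuming the path above: check_s3a_keys /etc/hadoop/conf/core-site.xml. The check only looks for the property names; it does not validate the values or test connectivity.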

 

 
 


3 REPLIES

Contributor

Hello,

 

I was able to run the same query by embedding the credentials in the location, as shown below. I am not sure whether the form that failed requires additional configuration.

 

alter table mostly_on_hdfs add partition (year=2015) location 's3a://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@myclouderaraj/root';
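
With this form the keys are part of the location string itself, so they may appear wherever that location is printed. If such a location has to be pasted into a log or a forum post, a small filter can strip the credentials first. This is a sketch only: redact_s3a_uri is a hypothetical name, and it assumes the keys contain no "/" characters.

```shell
# Hypothetical helper: remove an embedded "ACCESS_KEY:SECRET_KEY@" part
# from any s3a:// URI read from standard input.
redact_s3a_uri() {
  # [^@/[:space:]]* keeps the match inside the authority part of the URI,
  # so URIs without embedded credentials pass through unchanged.
  sed 's#s3a://[^@/[:space:]]*@#s3a://#g'
}
```

Usage: echo "s3a://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@myclouderaraj/root" | redact_s3a_uri prints the URI without the key pair.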

 

My intention in running these queries is to get metadata and lineage information for S3 items in Cloudera Navigator, but I see nothing related to S3 in Cloudera Navigator.

 

 

Thanks,

Rajesh

 


 

 
 

New Contributor

Hi Seth

As described in the document you linked, I added the following entries to my core-site.xml:

<property>
  <name>fs.s3a.access.key</name>
  <value>your_access_key</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>your_secret_key</value>
</property>

I then restarted the Impala and Hive services.

But when I issue an impala-shell command to create a table whose files are stored on S3, I still get an error saying that S3 credentials are not available.

 

This is the command:

 

impala-shell -i serverName -d schemaName -q "CREATE TABLE s3_test_tbl( \
yr_mnth STRING , \
p_id DOUBLE , \
p_full_na STRING 
) \
STORED AS PARQUET \
LOCATION 's3a://bucketname/path/'"

 

And this is the error I get:

 

No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: InterruptedIOException: doesBucketExist on biapps-snowflake-sbx-ascap: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: SocketTimeoutException: connect timed out