Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

S3 connectivity using impala shell

Contributor

Hello,

 

I tried the following operations on a Cloudera machine with impala-shell, following this document:

https://www.cloudera.com/documentation/enterprise/5-7-x/topics/impala_s3.html

 

On this Cloudera machine I verified S3 connectivity with the following command and was able to access the bucket:

hdfs dfs -Dfs.s3a.access.key=myaccesskey -Dfs.s3a.secret.key=mysecretkey -ls s3a://myclouderaraj/root

 

I updated /etc/hadoop/conf/hdfs-site.xml with the access key and secret key.

I restarted all the services and also tried restarting cloudera-scm-server.

Then I ran the following from impala-shell:

[localhost:21000] > create database db_on_hdfs;
[localhost:21000] > use db_on_hdfs;
[localhost:21000] > create table mostly_on_hdfs (x int) partitioned by (year int);
[localhost:21000] > alter table mostly_on_hdfs add partition (year=2013);
[localhost:21000] > alter table mostly_on_hdfs add partition (year=2014);
[localhost:21000] > alter table mostly_on_hdfs add partition (year=2015) location 's3a://impala-demo/dir1/dir2/dir3/t1';

[quickstart.cloudera:21000] > alter table mostly_on_hdfs add partition (year=2015) LOCATION 's3a://myclouderaraj/root';
Query: alter table mostly_on_hdfs add partition (year=2015) LOCATION 's3a://myclouderaraj/root'
ERROR: AnalysisException: null
CAUSED BY: AmazonClientException: Unable to load AWS credentials from any provider in the chain

[quickstart.cloudera:21000] >

 

Can you please help?

 

Thanks & Regards,

Rajesh

1 ACCEPTED SOLUTION

Moderator

Yes, you need to add the keys to core-site.xml for Impala to pick up this configuration.

 

 
 


3 REPLIES

Contributor

Hello,

 

I was now able to run the same query by putting the credentials in the location, as shown below. I am not sure whether the statement that failed needs any additional configuration.

 

alter table mostly_on_hdfs add partition (year=2015) location 's3a://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@myclouderaraj/root';

 

My intention in running these queries is to get metadata and lineage information for S3 items in Cloudera Navigator, but I see nothing related to S3 in Cloudera Navigator.

 

 

Thanks,

Rajesh

 

Moderator

Yes, you need to add the keys to core-site.xml for Impala to pick up this configuration.
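
Before restarting services, it can help to confirm that the two S3A properties really are present in the core-site.xml that Impala reads. The following is a minimal sketch, not an official Cloudera tool: the function names and the `SAMPLE` document are made up for illustration, and it only parses the XML text it is given, so it says nothing about which file a running daemon actually loaded.

```python
import xml.etree.ElementTree as ET

# Property names taken from the thread; both must be set for key-based S3A access.
REQUIRED_KEYS = ("fs.s3a.access.key", "fs.s3a.secret.key")


def read_hadoop_properties(xml_text):
    """Return {name: value} for every <property> element in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_s3a_keys(xml_text):
    """List the required S3A keys that are absent or have an empty value."""
    props = read_hadoop_properties(xml_text)
    return [key for key in REQUIRED_KEYS if not props.get(key)]


# Hypothetical sample: the secret key was forgotten.
SAMPLE = """<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <value>your_access_key</value>
  </property>
</configuration>"""

print(missing_s3a_keys(SAMPLE))
```

To check a real file, read it and pass its contents to the same function, for example `missing_s3a_keys(open("/etc/hadoop/conf/core-site.xml").read())`. That path is an assumption based on the directory mentioned earlier in the thread.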

 

 
 

New Member

Hi Seth

As described in the document you linked, I added the following entries to my core-site.xml

<property>
<name>fs.s3a.access.key</name>
<value>your_access_key</value>
</property>
<property>
<name>fs.s3a.secret.key</name>
<value>your_secret_key</value>
</property>

I then restarted the Impala and Hive services.

But when I issue an impala-shell command to create a table whose files are stored on S3, I still get an error about S3 credentials not being available.

 

This is the command:

 

impala-shell -i serverName -d schemaName -q "CREATE TABLE s3_test_tbl( \
yr_mnth STRING , \
p_id DOUBLE , \
p_full_na STRING \
) \
STORED AS PARQUET \
LOCATION 's3a://bucketname/path/'"

 

And this is the error I get:

 

No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: InterruptedIOException: doesBucketExist on biapps-snowflake-sbx-ascap: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: SocketTimeoutException: connect timed out
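
A trace like the one above is easier to read once it is reduced to two things: the credential providers that were tried, and the innermost cause. Here all three listed providers came up empty, and the last cause is a connect timeout while contacting a credentials service endpoint. That would be consistent with the process that ran the statement not seeing the keys from core-site.xml, though the trace alone does not prove it. The sketch below only does the text extraction. `summarize_credential_error` is a hypothetical helper name, and the patterns are written against the message format shown in this post, not against any documented log format.

```python
import re


def summarize_credential_error(trace):
    """Return (providers_tried, root_cause) extracted from an S3A credential error trace.

    providers_tried is a list of provider class names; root_cause is an
    (exception_name, message) tuple for the last "CAUSED BY" line, or None.
    """
    providers = []
    match = re.search(
        r"No AWS Credentials provided by ((?:\w+CredentialsProvider\s*)+)", trace
    )
    if match:
        providers = match.group(1).split()

    causes = re.findall(r"^CAUSED BY: (\w+): (.*)$", trace, flags=re.MULTILINE)
    root_cause = causes[-1] if causes else None
    return providers, root_cause


# Shortened copy of the trace from this post, used as sample input.
TRACE = """No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: SocketTimeoutException: connect timed out"""

providers, root_cause = summarize_credential_error(TRACE)
print(providers)
print(root_cause)
```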