Created on 12-12-2018 06:41 PM - edited 09-16-2022 06:59 AM
Hello,
I tried to perform the following operations for Impala on a Cloudera machine using impala-shell:
https://www.cloudera.com/documentation/enterprise/5-7-x/topics/impala_s3.html
On this Cloudera machine I verified S3 connectivity with the following command and was able to access the bucket:
hdfs dfs -Dfs.s3a.access.key=myaccesskey -Dfs.s3a.secret.key=mysecretkey -ls s3a://myclouderaraj/root
I updated /etc/hadoop/conf/hdfs-site.xml with the access and secret key.
I restarted all the services, and also tried restarting cloudera-scm-server.
Then I ran the following from impala-shell:
[localhost:21000] > create database db_on_hdfs;
[localhost:21000] > use db_on_hdfs;
[localhost:21000] > create table mostly_on_hdfs (x int) partitioned by (year int);
[localhost:21000] > alter table mostly_on_hdfs add partition (year=2013);
[localhost:21000] > alter table mostly_on_hdfs add partition (year=2014);
[localhost:21000] > alter table mostly_on_hdfs add partition (year=2015) location 's3a://impala-demo/dir1/dir2/dir3/t1';
[quickstart.cloudera:21000] > alter table mostly_on_hdfs add partition (year=2015) LOCATION 's3a://myclouderaraj/root';
Query: alter table mostly_on_hdfs add partition (year=2015) LOCATION 's3a://myclouderaraj/root'
ERROR: AnalysisException: null
CAUSED BY: AmazonClientException: Unable to load AWS credentials from any provider in the chain
[quickstart.cloudera:21000] >
Can you please help?
Thanks & Regards,
Rajesh
Created 12-13-2018 06:57 AM
Yes, you need to add it to core-site.xml for Impala to pick up this configuration.
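As a quick sanity check after editing the file, you can confirm both S3A credential properties are present in the core-site.xml Impala reads. This is a minimal sketch; the path in the usage example is the usual quickstart-VM location and may differ in your deployment.

```shell
# Hedged sketch: verify both S3A credential properties appear in a given
# core-site.xml. Pass the path of the config file Impala actually reads
# (e.g. /etc/hadoop/conf/core-site.xml on the quickstart VM - an assumption).
check_s3a_creds() {
  conf="$1"
  if grep -q '<name>fs.s3a.access.key</name>' "$conf" &&
     grep -q '<name>fs.s3a.secret.key</name>' "$conf"; then
    echo "both S3A credential keys present"
  else
    echo "missing S3A credential keys"
  fi
}
```

Usage: `check_s3a_creds /etc/hadoop/conf/core-site.xml`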
Created on 12-13-2018 03:35 AM - edited 12-13-2018 05:40 AM
Hello,
I was now able to run the same query by specifying the credentials inline, like this. I'm not sure whether the version that failed requires any additional configuration to be set.
alter table mostly_on_hdfs add partition (year=2015) location 's3a://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@myclouderaraj/root';
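One caveat with this inline form (a hedged note, not from the thread): if the secret key contains characters such as "/" or "+", the s3a:// URI will not parse correctly unless the key is URL-encoded first. A minimal encoding sketch, assuming python3 is available on the host:

```shell
# Hedged sketch: URL-encode a secret key before embedding it in an s3a:// URI.
# 'my/secret+key' is a made-up example value, not a real credential.
encode_key() {
  python3 -c "import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=''))" "$1"
}
encode_key 'my/secret+key'   # prints my%2Fsecret%2Bkey
```

Note that embedding credentials in the URI also leaks them into query text and logs, so the core-site.xml route is generally preferable.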
My intention in running these queries is to get metadata and lineage information for S3 items in Cloudera Navigator, but I see nothing related to S3 in Cloudera Navigator.
Thanks,
Rajesh
Created 05-14-2020 11:53 AM
Hi Seth
As described in the document you linked, I added the following entries to my core-site.xml
<property>
  <name>fs.s3a.access.key</name>
  <value>your_access_key</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>your_secret_key</value>
</property>
I then restarted the Impala and Hive services.
But when I issue an impala-shell command to create a table whose files are stored on S3, I still get an error about S3 credentials not being available.
This is the command:
impala-shell -i serverName -d schemaName -q "CREATE TABLE s3_test_tbl( \
yr_mnth STRING , \
p_id DOUBLE , \
p_full_na STRING
) \
STORED AS PARQUET \
LOCATION 's3a://bucketname/path/'"
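An equivalent way to issue the same DDL, avoiding the backslash line continuations, is to put it in a file and run it with -f. This is just a sketch; serverName, schemaName, bucketname, and path are the same placeholders as in the command above.

```shell
# Write the same DDL to a file; run it afterwards with:
#   impala-shell -i serverName -d schemaName -f /tmp/create_s3_test_tbl.sql
# (serverName, schemaName, and the bucket path are placeholders.)
cat > /tmp/create_s3_test_tbl.sql <<'SQL'
CREATE TABLE s3_test_tbl (
  yr_mnth   STRING,
  p_id      DOUBLE,
  p_full_na STRING
)
STORED AS PARQUET
LOCATION 's3a://bucketname/path/';
SQL
```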
And this is the error I get:
No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: InterruptedIOException: doesBucketExist on biapps-snowflake-sbx-ascap: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: SdkClientException: Unable to load credentials from service endpoint
CAUSED BY: SocketTimeoutException: connect timed out
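For what it's worth (a hedged diagnostic, not a confirmed root cause): the final SocketTimeoutException suggests the SDK fell through the whole credentials chain to the instance-profile provider, which queries the EC2 instance-metadata service and timed out. A quick reachability probe:

```shell
# Hedged diagnostic: probe the EC2 instance-metadata endpoint that
# SharedInstanceProfileCredentialsProvider queries. A timeout here on a
# non-EC2 host is expected and would explain the "connect timed out" cause;
# in that case the fix is making the keys visible to Impala, not networking.
probe_metadata() {
  url="${1:-http://169.254.169.254/latest/meta-data/}"
  if curl -s --max-time 3 -o /dev/null "$url"; then
    echo "metadata service reachable"
  else
    echo "metadata service unreachable"
  fi
}
```

Running `probe_metadata` with no argument checks the standard metadata address; note the chain only reaches this provider because the BasicAWSCredentialsProvider (fs.s3a.* keys) and environment-variable providers found nothing, so it is also worth confirming the edited core-site.xml is the one deployed to the Impala daemons.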