
Unable to create a Hive table from a .csv file located in a directory within an S3 location (s3a://test/dir2/); however, the same works when the .csv file is present directly in the S3 bucket and not inside any directory (s3a://test/)

Explorer
  • We specified the following environment variables:

 

export AWS_ACCESS_KEY_ID=xMK6bdX8iY**************************************
export AWS_SECRET_KEY=34***************************************
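
(For reference, a quick check we can run from the same shell, since environment variables exported there are visible only to processes launched from it and not to already-running services such as HiveServer2 or the Metastore; <endpoint> below is a placeholder for the value used in the Hive session further down:)

# reuses the AWS_* variables exported above in this same shell
hadoop fs -D fs.s3a.endpoint=<endpoint> -ls s3a://test/dir2/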

 

  • After connecting to a Hive session, we set the s3a credentials:

 

set fs.s3a.endpoint=cluster.domain.*;
set fs.s3a.access.key=$$$$$$$$$$$$$$###;
set fs.s3a.secret.key=####$$$$;
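
For completeness, a sketch of one more per-session property that is sometimes set alongside the above (a standard Hadoop S3A property name; whether HiveServer2 and the Metastore honour per-session values depends on the deployment and on hive.conf.restricted.list):

-- pin the provider that reads fs.s3a.access.key / fs.s3a.secret.key (sketch only)
set fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider;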

 

  • We then tried to create a table using the query below, with the directory location in the S3 bucket (s3a://test/dir2/), and received the following error, even though the S3 credentials had already been specified as stated above:

 

0: jdbc:hive2://> CREATE EXTERNAL TABLE s3dir (
. . . . . . . . > col1 int,
. . . . . . . . > col2 string,
. . . . . . . . > col3 string,
. . . . . . . . > col4 string
. . . . . . . . > )
. . . . . . . . > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
. . . . . . . . > LOCATION 's3a://test/dir2/'
. . . . . . . . > TBLPROPERTIES (
. . . . . . . . > "s3select.format" = "csv"
. . . . . . . . > );
22/05/03 03:06:32 [2199007f-0721-4e46-89b6-40cef824235c main]: WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
22/05/03 03:06:36 [HiveServer2-Background-Pool: Thread-71]: ERROR exec.Task: Failed
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: java.nio.file.AccessDeniedException s3a://test/dir2: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)))
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1170) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1175) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableNonReplaceMode(CreateTableOperation.java:140) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:98) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:82) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) [hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) [hive-service-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:88) [hive-service-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:327) [hive-service-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_322]
        at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_322]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) [hadoop-common-3.1.1.7.1.7.1000-141.jar:?]
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:345) [hive-service-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_322]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_322]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_322]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_322]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_322]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_322]
        at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Got exception: java.nio.file.AccessDeniedException s3a://test/dir2: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result$create_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:63918) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result$create_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:63886) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result.read(ThriftHiveMetastore.java:63812) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_req(ThriftHiveMetastore.java:1796) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_req(ThriftHiveMetastore.java:1783) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:3622) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:145) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:1082) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:1067) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_322]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_322]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_322]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_322]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at com.sun.proxy.$Proxy35.createTable(Unknown Source) ~[?:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_322]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_322]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_322]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_322]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:3515) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        at com.sun.proxy.$Proxy35.createTable(Unknown Source) ~[?:?]
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1159) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
        ... 28 more
22/05/03 03:06:36 [HiveServer2-Background-Pool: Thread-71]: ERROR exec.Task: DDLTask failed, DDL Operation: class org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: java.nio.file.AccessDeniedException s3a://test/dir2: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)))

ERROR : FAILED: Execution Error, return code 40000 from org.apache.hadoop.hive.ql.ddl.DDLTask. MetaException(message:Got exception: java.nio.file.AccessDeniedException s3a://test/dir2: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)))
Error: Error while compiling statement: FAILED: Execution Error, return code 40000 from org.apache.hadoop.hive.ql.ddl.DDLTask. MetaException(message:Got exception: java.nio.file.AccessDeniedException s3a://test/dir2: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))) (state=08S01,code=40000)
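
From the stack trace, the MetaException comes back from the Metastore call (ThriftHiveMetastore$Client.recv_create_table_req), so it appears to be the Metastore that cannot authenticate against S3 while checking the table location. As a small diagnostic (a sketch only), "set" with just the property name echoes the value this HiveServer2 session currently sees; the Metastore runs as a separate service and may not pick up per-session overrides:

set fs.s3a.access.key;
set fs.s3a.aws.credentials.provider;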

 

 

  • However, the same works when the .csv file is present directly in the S3 bucket and not inside any directory (s3a://test/):

 

 

0: jdbc:hive2://> CREATE EXTERNAL TABLE s3notdir (
. . . . . . . . > col1 int,
. . . . . . . . > col2 string,
. . . . . . . . > col3 string,
. . . . . . . . > col4 string
. . . . . . . . > )
. . . . . . . . > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
. . . . . . . . > LOCATION 's3a://test/'
. . . . . . . . > TBLPROPERTIES (
. . . . . . . . > "s3select.format" = "csv"
. . . . . . . . > );
OK
No rows affected (2.223 seconds)
0: jdbc:hive2://>

Master Collaborator

Hi @mmk, please let us know which version of CDP you are using, and whether it is on-prem or public cloud. Also see this documentation page on working with S3: https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/cloud-data-access/topics/cr-cda-configuring-a...
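
Also, only as a rough sketch (and not necessarily exactly what that page shows): one common approach is to keep the keys in a Hadoop credential provider rather than in environment variables, for example

hadoop credential create fs.s3a.access.key -value <access-key-id> -provider jceks://hdfs/user/hive/s3.jceks
hadoop credential create fs.s3a.secret.key -value <secret-key> -provider jceks://hdfs/user/hive/s3.jceks

(the jceks path above is only an example), and then point hadoop.security.credential.provider.path at that file in the cluster configuration so that HiveServer2 and the Metastore can read the keys.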

 

Kind regards,

Alex

Explorer

Hi @aakulov,

We are using an on-prem (bare-metal) cluster (Cloudera Manager screenshot attached):

Cloudera Manager version 7.6.1

Cloudera Runtime 7.1.7 (Parcels)

We configured the AWS credentials the same way as described in the link you shared, but we still get the "Unable to load AWS credentials" error whenever a directory is included in the s3a URL (e.g. "s3a://test/directory").
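
One more data point we can collect (a sketch, using the paths from this thread): comparing bucket-root access with the directory prefix from outside Hive, ideally on the HiveServer2/Metastore host with its effective configuration, to see whether the prefix itself is being denied:

hadoop fs -ls s3a://test/
hadoop fs -ls s3a://test/dir2/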