I have configured the S3 keys (access key and secret key) in a JCEKS file using the Hadoop credential API. The commands I used are:
hadoop credential create fs.s3a.access.key -provider jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks
hadoop credential create fs.s3a.secret.key -provider jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks
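Before moving on, it can help to confirm that both aliases actually landed in the keystore. A minimal sketch, assuming the `hadoop` client is on the PATH and the provider path is the one used above (the `command -v` guard is only there so the snippet is safe to paste on a machine without the client):

```shell
# Provider path copied from the commands above; adjust to your cluster.
PROVIDER="jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks"

if command -v hadoop >/dev/null 2>&1; then
  # Lists the aliases stored in the keystore (the secret values are not printed).
  hadoop credential list -provider "$PROVIDER"
else
  echo "hadoop client not found; skipping: hadoop credential list -provider $PROVIDER"
fi
```

Both fs.s3a.access.key and fs.s3a.secret.key should show up in the listing.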
Then, I am opening a connection to Spark Thrift Server using beeline and passing the jceks file path in the connection string as below:
beeline -u "jdbc:hive2://hostname:10001/;principal=hive/_HOST@?hadoop.security.credential.provider.path=jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks;"
Now, when I try to create an external table with the location in s3, it fails with the below exception:
CREATE EXTERNAL TABLE IF NOT EXISTS test_table_on_s3 (col1 String, col2 String) row format delimited fields terminated by ',' LOCATION 's3a://bucket_name/kalmesh/';
Exception: Error: org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.nio.file.AccessDeniedException s3a://bucket_name/kalmesh: getFileStatus on s3a://bucket_name/kalmesh: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: request_id), S3 Extended Request ID: extended_request_id=) (state=,code=0)
However, this works fine with Hive Thrift Server.
HDP version: HDP 2.5
Spark version: 1.6
Not entirely sure if this was the issue (you seem to have run into it a few years ago), but it is important to understand the following: Hive data is stored on HDFS. However, the security policies for HDFS and Hive may be different. In fact, it is recommended that you do NOT give HDFS-level permissions on the warehouse directory to anyone, and instead use Ranger to grant permissions at the SQL level only, on the databases and tables that live there. As a result, you might have been comparing apples to pears: validating access with an HDFS read, and then using beeline to do a table read, which goes through different security policies.
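For the S3 case in the original question, one way to separate the two access paths is to list the table location directly with the Hadoop CLI, pointing it at the same credential store. This is only a sketch: the provider path and bucket are the ones from the original post, and it assumes the `hadoop` client is installed on the node where you run it. If this also comes back with a 403, the problem is more likely in the keys or the bucket policy than in the Thrift Server.

```shell
# Values copied from the original post; adjust to your environment.
PROVIDER="jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks"
LOCATION="s3a://bucket_name/kalmesh/"

if command -v hadoop >/dev/null 2>&1; then
  # -D sets the credential provider path for this one command only.
  hadoop fs -D hadoop.security.credential.provider.path="$PROVIDER" -ls "$LOCATION"
else
  echo "hadoop client not found; skipping S3A listing of $LOCATION"
fi
```

If the listing works here but the CREATE EXTERNAL TABLE still fails, the credential provider path is probably not reaching the service that executes the DDL.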
@sam_kalmesh - Were you able to get a resolution for the error posted? We are hitting a similar access error when trying to create a table from the Hive shell using an s3a path, i.e. s3a://bucket1/test_data_table:
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403
Caused by: java.nio.file.AccessDeniedException: s3a://bucket1/test_data_table: getFileStatus on s3a://bucket1/test_data_table
If the external path is changed to the o3fs style for the same bucket, the table is created successfully, but creation fails with the s3a path style.
The following config settings were also added to "Hive Service Advanced Configuration Snippet (Safety Valve) for core-site.xml":
@skommineni As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post. Thanks.