Created 05-19-2017 05:56 AM
I have configured the S3 keys (access key and secret key) in a JCEKS file using the Hadoop credential provider API. The commands used are as below:
hadoop credential create fs.s3a.access.key -provider jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks
hadoop credential create fs.s3a.secret.key -provider jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks
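As a sanity check, the provider contents can be listed with the same tool (this prints only the alias names, not the secret values):
hadoop credential list -provider jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks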
Then, I am opening a connection to Spark Thrift Server using beeline and passing the jceks file path in the connection string as below:
beeline -u "jdbc:hive2://hostname:10001/;principal=hive/_HOST@?hadoop.security.credential.provider.path=jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks;"
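For reference, the hive2 connection string is laid out as host, port, and database, then session variables (such as principal) separated by semicolons, then Hive conf entries after a question mark. A generic sketch, where REALM and the conf key/value are hypothetical placeholders, not values taken from the command above:
# generic layout only; REALM and <hive.conf.key>=<value> are placeholders
jdbc:hive2://<host>:<port>/<db>;principal=hive/_HOST@REALM?<hive.conf.key>=<value>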
Now, when I try to create an external table with the location in s3, it fails with the below exception:
CREATE EXTERNAL TABLE IF NOT EXISTS test_table_on_s3 (col1 String, col2 String) row format delimited fields terminated by ',' LOCATION 's3a://bucket_name/kalmesh/';
Exception: Error: org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.nio.file.AccessDeniedException s3a://bucket_name/kalmesh: getFileStatus on s3a://bucket_name/kalmesh: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: request_id), S3 Extended Request ID: extended_request_id=) (state=,code=0)
However, this works fine with Hive Thrift Server.
HDP version: HDP 2.5
Spark version: 1.6
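For completeness, the credentials in the JCEKS file can be exercised outside the Thrift Server with a plain FileSystem command; this is only a sketch reusing the provider path and the bucket placeholder from above:
# sketch: verify the S3A credentials independently of Hive/Spark
hadoop fs -D hadoop.security.credential.provider.path=jceks://hdfs@nn_hostname/tmp/s3creds_test.jceks -ls s3a://bucket_name/kalmesh/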
Created 05-19-2017 06:20 PM
(Just so that answers don't get duplicated, the question is also posted at http://stackoverflow.com/questions/44062349/issue-creating-accessing-hive-external-table-with-s3-loc... )
Created 08-03-2021 01:48 PM
Not entirely sure if this was the issue (you seem to have run into it a few years ago), but it is important to understand the following: Hive data is stored on HDFS, yet the security policies for HDFS and Hive may be different. In fact, it is recommended that you do NOT give anyone HDFS-level permissions on the warehouse directory, and instead use Ranger to grant permissions only at the SQL level, on the databases and tables that live there. As a result, you might have been comparing apples to pears (trying to validate access by doing an HDFS read, and then using beeline to do a table read, which goes through different security policies).
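To make the distinction concrete, the two checks below go through different enforcement points; the warehouse path, host, and table names are placeholders, not taken from this thread:
# HDFS-level read: governed by HDFS permissions / Ranger HDFS policies
hdfs dfs -ls /apps/hive/warehouse/some_db.db/some_table
# SQL-level read: governed by Ranger Hive policies enforced in HiveServer2
beeline -u "jdbc:hive2://<hs2-host>:10000/" -e "SELECT * FROM some_db.some_table LIMIT 1"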
Created on 08-23-2023 04:57 AM - edited 08-23-2023 05:10 AM
@sam_kalmesh - Were you able to get a resolution for the error posted? We are hitting a similar access error when trying to create a table from the Hive shell using an s3a path, i.e. (s3a://bucket1/test_data_table):
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403
Caused by: java.nio.file.AccessDeniedException: s3a://bucket1/test_data_table: getFileStatus on s3a://bucket1/test_data_table
If the external path to the same bucket is changed to the o3fs style, the tables are created successfully, but creation fails with the s3a path style.
The following config settings were also added to "Hive Service Advanced Configuration Snippet (Safety Valve) for core-site.xml":
fs.s3a.bucket.bucket1.access.key
fs.s3a.impl
fs.s3a.bucket.bucket1.secret.key
fs.s3a.endpoint
fs.s3a.bucket.probe
fs.s3a.change.detection.version.required
fs.s3a.path.style.access
fs.s3a.change.detection.mode
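One way to take Hive out of the picture is to try the same per-bucket settings from the command line; the key, secret, and endpoint values below are placeholders for illustration only:
# sketch: list the table path with the per-bucket S3A settings supplied inline
hadoop fs \
  -D fs.s3a.bucket.bucket1.access.key=ACCESS_KEY_PLACEHOLDER \
  -D fs.s3a.bucket.bucket1.secret.key=SECRET_KEY_PLACEHOLDER \
  -D fs.s3a.endpoint=ENDPOINT_PLACEHOLDER \
  -D fs.s3a.path.style.access=true \
  -ls s3a://bucket1/test_data_table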
Created 08-23-2023 08:16 AM
@skommineni As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post. Thanks.
Regards,
Diana Torres
Created 08-23-2023 10:17 PM
@DianaTorres - Sure, will do. Thanks.