I am running an Amazon-like S3a endpoint (i.e. not within AWS), and while I can run "hadoop fs" and "distcp" successfully, it seems as though Hive completely ignores the endpoint. I made the s3a settings in core-site.xml, and I've also tried hdfs-site.xml with no luck.
Specifically, I'm using the Hortonworks sandbox download, so I'm not sure if it's an issue with that.
Has anyone got suggestions on where to look?
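For reference, this is the kind of core-site.xml setting I mean (the host name here is just a placeholder for my non-AWS, S3-compatible endpoint; `fs.s3a.path.style.access` is often needed for such endpoints as well):

```xml
<!-- core-site.xml: point the s3a connector at a non-AWS endpoint.
     Host/port below are placeholders, not my real values. -->
<property>
  <name>fs.s3a.endpoint</name>
  <value>s3.example.internal:9020</value>
</property>
<property>
  <name>fs.s3a.path.style.access</name>
  <value>true</value>
</property>
```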
Please see these articles, which describe how to use Hive with S3. Also, the HDC (Hortonworks Data Cloud) offering on Amazon includes more advanced S3 connectivity capabilities.
"HDC and S3":
"Using AWS S3 as the Hive warehouse":
Thank you, these are very useful links for when I get the target working. Right now I have a more basic issue: it's not honoring the s3a endpoint in core-site.xml. As I mentioned, distcp and hadoop fs find my s3a endpoint, but Hive ignores it. I tried adding the endpoint to the Hive config, but Ambari said the field was already defined elsewhere, i.e. in core-site.
Any further thoughts to make hive wake up?
The above issue is caused by https://issues.apache.org/jira/browse/HIVE-20386. Refer to the bug for more details.
As a workaround, you can try one of the methods below:
Method 1: Set the following config in core-site.xml (replace <bucket_name> and <jceks_file_path> accordingly):

fs.s3a.bucket.<bucket_name>.security.credential.provider.path = <jceks_file_path>
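As a sketch, Method 1 might look like the following in core-site.xml; the bucket name `mybucket` and the JCEKS path are placeholders for your own values:

```xml
<!-- core-site.xml: per-bucket credential provider.
     Bucket name and JCEKS path are placeholders. -->
<property>
  <name>fs.s3a.bucket.mybucket.security.credential.provider.path</name>
  <value>jceks://hdfs/user/admin/s3.jceks</value>
</property>
```

The JCEKS file itself can be populated with the hadoop credential CLI, e.g. `hadoop credential create fs.s3a.access.key -provider jceks://hdfs/user/admin/s3.jceks` (and likewise for fs.s3a.secret.key), entering the key values when prompted.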
Method 2: Set the following configs in core-site.xml (replace <bucket_name> accordingly):

fs.s3a.bucket.<bucket_name>.access.key = <s3a access key>
fs.s3a.bucket.<bucket_name>.secret.key = <s3a secret key>
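In core-site.xml form, Method 2 might look like this; `mybucket` and the key values are placeholders (note this stores credentials in plain text, unlike the JCEKS approach above):

```xml
<!-- core-site.xml: per-bucket access credentials.
     Bucket name and key values are placeholders. -->
<property>
  <name>fs.s3a.bucket.mybucket.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.bucket.mybucket.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```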
Let us know if the resolution works.