Created 12-09-2016 04:10 AM
I have a question about accessing AWS S3 buckets that belong to different accounts from Hive.
I have several S3 buckets which belong to different AWS accounts.
I can access one of the buckets in Hive. However, I have to write fs.s3a.access.key and fs.s3a.secret.key into hive-site.xml, which means that one instance of Hive can only access one AWS account. Is that right?
I want to use buckets from different AWS accounts in one Hive instance. Is that possible?
Created 12-12-2016 10:56 AM
You can use different buckets which a single account has access to, but no, you can't work across accounts, because of that single fs.s3a.access.key property.
There is one *dangerous* way to work around that, which is to put the key and secret in the URL, in the form s3a://key:secret@bucket/path . That encodes the secret in the URL, and it takes precedence over anything in the configuration. But those URLs will end up being logged in places, so there's a risk of the secrets getting into the logs. This is why warning messages are printed when you authenticate this way.
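To make the logging risk concrete, here is a minimal Python sketch of how the key and secret travel inside such a URL, and how they could be stripped before the URL is written anywhere. It is not part of Hadoop or Hive; the function name `redact_s3a_url` and the sample key, secret and bucket are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_s3a_url(url):
    """Return the URL without any embedded key:secret, so it is safe to log.

    Hypothetical helper for illustration only; not a Hadoop or Hive API.
    """
    parts = urlsplit(url)
    if "@" not in parts.netloc:
        # No credentials embedded in the URL, nothing to strip.
        return url
    # Everything before the last "@" in the authority is the key:secret pair.
    host = parts.netloc.rsplit("@", 1)[1]
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


# Placeholder values, not real credentials.
unsafe = "s3a://EXAMPLEKEY:EXAMPLESECRET@my-bucket/warehouse/table1"

# The credentials are ordinary parts of the URL, so anything that prints
# the path prints them as well.
print(urlsplit(unsafe).username)   # the access key
print(urlsplit(unsafe).password)   # the secret key

print(redact_s3a_url(unsafe))      # URL with the key:secret removed
```

This only helps for logging you control yourself. It does nothing about the places where the tools log the path on their own, which is why the URL form stays dangerous.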
This is something which is going to have to be fixed, not just for authentication but also to deal with the rollout of Amazon's V4 authentication mechanism, where you need to specify the S3 endpoint for the region you want to work with (Frankfurt and Seoul so far). Supporting multiple regions is a similar problem to having multiple accounts: different buckets need different settings.
Created 12-13-2016 11:37 PM
Created 12-15-2016 05:21 AM
@stevel Thanks for your answer.
@Dominika Thanks for updating the docs.