Is it possible to create two tables with different s3a:// buckets in the same Hive database?

I have a question about accessing AWS S3 buckets that belong to different accounts from Hive.

I have several S3 buckets which belong to different AWS accounts.

I can access one of the buckets in Hive. However, I have to write fs.s3a.access.key and fs.s3a.secret.key into hive-site.xml, which means that one instance of Hive can only access one AWS account. Is that right?

I want to use buckets from different AWS accounts in one Hive instance. Is that possible?

Accepted Solutions

You can use different buckets that a single account has access to, but no, you can't work across accounts, because there is only that single fs.s3a.access.key property.

There is one *dangerous* way to work around that: put the key and secret in the URL, in the form s3a://key:secret@bucket/path. That encodes the secret in the URL, and it takes precedence over anything in the configuration. But those URLs end up being logged in various places, so there's a risk of the secrets getting into the logs. This is why warning messages are printed when you authenticate this way.
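
To make the logging risk concrete, here is a minimal Python sketch of masking the key and secret in such a URL before it is written anywhere. The function name redact_s3a_url and the sample values are made up for illustration; this is not part of Hadoop or Hive, and it does nothing about the places where Hadoop itself logs the URL. It also assumes the secret contains no unencoded "/" character.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_s3a_url(url: str) -> str:
    """Return the URL with any embedded key:secret userinfo masked.

    Hypothetical helper for illustration only. Assumption: the secret has
    no unencoded "/", otherwise the URL does not split as expected and the
    input is returned unchanged.
    """
    parts = urlsplit(url)
    if "@" not in parts.netloc:
        # No embedded credentials were found, so there is nothing to mask.
        return url
    # Everything after the last "@" is the bucket name.
    bucket = parts.netloc.rpartition("@")[2]
    return urlunsplit(
        (parts.scheme, "***:***@" + bucket, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(redact_s3a_url("s3a://EXAMPLEKEY:EXAMPLESECRET@my-bucket/warehouse/t1"))
```

A helper like this only protects log lines your own code writes. The safer choice remains keeping the secret out of the URL altogether.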

This is going to have to be fixed, not just for authentication but also to deal with the rollout of Amazon's V4 authentication mechanism, where you need to specify the S3 endpoint for the region you want to work with (Frankfurt and Seoul so far). Supporting multiple regions is a similar problem to having multiple accounts: different buckets need different settings.

Thanks for answering this, @stevel. I will add a note to the docs that the authentication configuration allows you to access all the buckets to which a single account has access, and that you cannot work across multiple accounts. I will not add the dangerous workaround unless you recommend that I do...

@stevel Thanks for your answer.

@Dominika Thanks for updating the docs.