
Issue passing credential provider for reading amazon s3 data - HDP 2.5.0 (Ambari Managed)


Contributor

Hi,

We have a requirement to fetch data from Amazon S3 and are trying to configure HDFS to support that. So far we have configured the fs.s3a.secret.key and fs.s3a.access.key properties, passing the credentials as plain text, which works fine. As part of our security requirements we are now trying to encrypt the credentials using the Hadoop credential API, but reads are failing with the error message below:

$ hadoop fs -ls s3a://aws-test/testfile.csv

ls: doesBucketExist on wework-mz: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain: Unable to load AWS credentials from any provider in the chain

Steps done:

hadoop credential create fs.s3a.access.key -provider jceks://hdfs/app/awss3/aws.jceks
hadoop credential create fs.s3a.secret.key -provider jceks://hdfs/app/awss3/aws.jceks

$ hadoop credential list -provider jceks://hdfs/app/awss3/aws.jceks
Listing aliases for CredentialProvider: jceks://hdfs/app/awss3/aws.jceks
fs.s3a.secret.key
fs.s3a.access.key

Note: NameNode is in HA
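Since the NameNode is HA, one thing worth checking is the authority in the jceks URI: the provider path can name the HDFS nameservice rather than a single NameNode host, so the path stays valid across failovers. A minimal sketch, assuming a hypothetical nameservice called mycluster (substitute your dfs.nameservices value from hdfs-site.xml):

```shell
# "mycluster" is a placeholder for the actual dfs.nameservices value.
# Using the logical nameservice in the jceks URI keeps the credential
# store reachable regardless of which NameNode is currently active.
hadoop credential create fs.s3a.access.key \
  -provider jceks://hdfs@mycluster/app/awss3/aws.jceks
hadoop credential create fs.s3a.secret.key \
  -provider jceks://hdfs@mycluster/app/awss3/aws.jceks
```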

7 REPLIES

Re: Issue passing credential provider for reading amazon s3 data - HDP 2.5.0 (Ambari Managed)

@Saikiran Parepally you have to pass the -Dhadoop.security.credential.provider.path argument.

Please see my article here: https://community.hortonworks.com/articles/59161/using-hadoop-credential-api-to-store-aws-secrets.ht...

The hadoop.security.credential.provider.path property is not supported within core-site.xml. If you set only fs.s3a.secret.key and fs.s3a.access.key and restart HDFS, those credentials will be used, but that means all users have access to the bucket(s) to which those IAM credentials are tied.

The best way to pass credentials using the Hadoop credentials API is as follows:

hdfs dfs -Dhadoop.security.credential.provider.path=jceks://hdfs/user/admin/aws.jceks -ls s3a://my-bucket
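For reference, the plain-text alternative described above (setting the keys directly, visible to every user of the cluster config) would look roughly like this in core-site.xml; the key values shown are placeholders:

```xml
<!-- core-site.xml: plain-text S3A credentials (works, but exposed to all users) -->
<property>
  <name>fs.s3a.access.key</name>
  <value>REPLACE_WITH_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>REPLACE_WITH_SECRET_KEY</value>
</property>
```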

Re: Issue passing credential provider for reading amazon s3 data - HDP 2.5.0 (Ambari Managed)

Contributor

Thanks @slachterman .. I have followed the same article as mentioned above. From the CLI I am able to successfully access S3 buckets by passing -Dhadoop.security.credential.provider.path=jceks://hdfs/app/awss3/aws.jceks. But if I configure the same property in core-site.xml, the Hadoop services fail to start with the error below. So I fell back to configuring fs.s3a.secret.key and fs.s3a.access.key, and then got the error mentioned above.
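Presumably the core-site.xml entry that triggers the crash below was something like this (property name and jceks path as described earlier in this thread):

```xml
<!-- core-site.xml: setting the provider path cluster-wide (this is what fails) -->
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs/app/awss3/aws.jceks</value>
</property>
```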

Exception in thread "main" java.lang.StackOverflowError
	at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
	at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
	at java.io.File.exists(File.java:819)
	at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1245)
	at sun.misc.URLClassPath$FileLoader.findResource(URLClassPath.java:1212)
	at sun.misc.URLClassPath$1.next(URLClassPath.java:240)
	at sun.misc.URLClassPath$1.hasMoreElements(URLClassPath.java:250)
	at java.net.URLClassLoader$3$1.run(URLClassLoader.java:601)
	at java.net.URLClassLoader$3$1.run(URLClassLoader.java:599)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader$3.next(URLClassLoader.java:598)
	at java.net.URLClassLoader$3.hasMoreElements(URLClassLoader.java:623)
	at sun.misc.CompoundEnumeration.next(CompoundEnumeration.java:45)
	at sun.misc.CompoundEnumeration.hasMoreElements(CompoundEnumeration.java:54)
	at java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:354)
	at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
	at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
	at javax.xml.parsers.FactoryFinder$1.run(FactoryFinder.java:293)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:289)
	at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:267)
	at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:120)
	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2549)
	at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2526)
	at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2418)
	at org.apache.hadoop.conf.Configuration.get(Configuration.java:1232)
	at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:675)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:286)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:274)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:804)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2920)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2910)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2776)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:377)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
	at org.apache.hadoop.security.alias.JavaKeyStoreProvider.initFileSystem(JavaKeyStoreProvider.java:89)
	at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.<init>(AbstractJavaKeyStoreProvider.java:82)
	at org.apache.hadoop.security.alias.JavaKeyStoreProvider.<init>(JavaKeyStoreProvider.java:49)
	at org.apache.hadoop.security.alias.JavaKeyStoreProvider.<init>(JavaKeyStoreProvider.java:41)
	at org.apache.hadoop.security.alias.JavaKeyStoreProvider$Factory.createProvider(JavaKeyStoreProvider.java:100)
	at org.apache.hadoop.security.alias.CredentialProviderFactory.getProviders(CredentialProviderFactory.java:58)
	at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:1959)
	at org.apache.hadoop.conf.Configuration.getPassword(Configuration.java:1939)
	at org.apache.hadoop.security.LdapGroupsMapping.getPassword(LdapGroupsMapping.java:621)
	at org.apache.hadoop.security.LdapGroupsMapping.setConf(LdapGroupsMapping.java:564)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:76)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:99)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:95)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:420)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:297)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:274)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:804)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
	at or

Re: Issue passing credential provider for reading amazon s3 data - HDP 2.5.0 (Ambari Managed)

Please see updated answer and accept if helpful.

Re: Issue passing credential provider for reading amazon s3 data - HDP 2.5.0 (Ambari Managed)

Contributor

Thanks @slachterman .. what is the best way to pass credentials using the Hadoop credentials API? Per the documentation I have seen, I should be able to pass fs.s3a.security.credential.provider.path, but that parameter is not working.

Re: Issue passing credential provider for reading amazon s3 data - HDP 2.5.0 (Ambari Managed)

Please see updated answer.

Re: Issue passing credential provider for reading amazon s3 data - HDP 2.5.0 (Ambari Managed)

Contributor

@slachterman the main reason I want to add the credential provider to the Hadoop configs is that we are planning to create Hive tables on top of the S3 data, so that authorized users (via Ranger policies) can access those tables.

I tried to pass hadoop.security.credential.provider.path as a parameter in the HiveServer2 connection, but that does not give access to S3. I am getting the error below:

Error: Failed to open new session: org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: Cannot modify hadoop.security.credential.provider.path at runtime. It is not in list of params that are allowed to be modified at runtime (state=,code=0)

To address the above error, I added hadoop.security.credential.provider.path to hive.security.authorization.sqlstd.confwhitelist.append, but I am still getting the same error.
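For context, the two pieces being combined here would look roughly like this. This is an illustrative sketch: the host name is hypothetical, the whitelist property takes regexes (so the dots are escaped), and the jceks path is the one used earlier in this thread.

```shell
# 1) In hive-site.xml / hiveserver2-site (via Ambari), whitelist the property,
#    then restart HiveServer2:
#      hive.security.authorization.sqlstd.confwhitelist.append=hadoop\.security\.credential\.provider\.path
#
# 2) Pass the provider path per session via the JDBC URL; in a HiveServer2 URL,
#    the list after "?" sets Hive configuration properties for the session:
beeline -u "jdbc:hive2://hs2-host:10000/default?hadoop.security.credential.provider.path=jceks://hdfs/app/awss3/aws.jceks"
```

If the whitelist change was made only in hive-site.xml, it may also be worth confirming it landed in the HiveServer2 process config (hiveserver2-site) and that the service was actually restarted afterwards.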


Re: Issue passing credential provider for reading amazon s3 data - HDP 2.5.0 (Ambari Managed)

OK, you've found a new problem. Congratulations. Or commiserations. I'm filing a bug against that (). The code path triggering this should only be reached if fs.s3a.security.credential.provider.path is set. That option should only be needed if you are hoping to provide a specific set of credentials for different buckets, customising it per bucket (fs.s3a.bucket.dev-1.security.credential.provider.path=/secrets/dev.jceks, etc.). If you have one set of secrets for all S3 buckets, set it in the main config for everything, which is what you are trying on the second attempt. Maybe @lmccay has some suggestion.
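A per-bucket override of the kind mentioned above would look roughly like this in core-site.xml. The bucket name "dev-1" and the paths are illustrative, and per-bucket fs.s3a.bucket.* options are only honoured on Hadoop versions that support per-bucket S3A configuration:

```xml
<!-- Illustrative: credentials specific to the hypothetical bucket "dev-1" -->
<property>
  <name>fs.s3a.bucket.dev-1.security.credential.provider.path</name>
  <value>jceks://hdfs/secrets/dev.jceks</value>
</property>
<!-- Fallback provider path for all other S3A buckets -->
<property>
  <name>fs.s3a.security.credential.provider.path</name>
  <value>jceks://hdfs/app/awss3/aws.jceks</value>
</property>
```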
