Created 06-04-2017 11:01 PM
Hi,
We have a requirement to fetch data from Amazon S3 and are configuring HDFS to support that. So far we have configured the fs.s3a.secret.key and fs.s3a.access.key properties, passing the credentials as plain text, which works fine. As part of our security requirements we are trying to encrypt the credentials using the Hadoop credential API, but S3A fails to read them. We are getting the error message below:
$ hadoop fs -ls s3a://aws-test/testfile.csv
ls: doesBucketExist on wework-mz: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain: Unable to load AWS credentials from any provider in the chain
Steps done:
hadoop credential create fs.s3a.access.key -provider jceks://hdfs/app/awss3/aws.jceks
hadoop credential create fs.s3a.secret.key -provider jceks://hdfs/app/awss3/aws.jceks
$ hadoop credential list -provider jceks://hdfs/app/awss3/aws.jceks
Listing aliases for CredentialProvider: jceks://hdfs/app/awss3/aws.jceks
fs.s3a.secret.key
fs.s3a.access.key
Note: NameNode is in HA
Created 06-05-2017 06:44 PM
@Saikiran Parepally you have to pass the -Dhadoop.security.credential.provider.path argument.
Please see my article here: https://community.hortonworks.com/articles/59161/using-hadoop-credential-api-to-store-aws-secrets.ht...
The property hadoop.security.credential.provider.path is not supported within core-site.xml. If you set only fs.s3a.secret.key and fs.s3a.access.key and restart HDFS, those credentials will be used. But that means all users have access to the bucket(s) to which these IAM credentials are tied.
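As a minimal sketch, setting the keys directly in core-site.xml would look roughly like this (the values are placeholders for your real IAM keys):

<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_AWS_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_AWS_SECRET_ACCESS_KEY</value>
</property>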
The best way to pass credentials using the Hadoop credential API is as follows:
hdfs dfs -Dhadoop.security.credential.provider.path=jceks://hdfs/user/admin/aws.jceks -ls s3a://my-bucket
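The same generic -D option should work with other Hadoop tools as well; for example, a hypothetical distcp invocation using the same jceks path would look like:
hadoop distcp -Dhadoop.security.credential.provider.path=jceks://hdfs/user/admin/aws.jceks /user/admin/data s3a://my-bucket/data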
Created 06-05-2017 07:30 PM
Thanks @slachterman .. I have followed the same article mentioned above. From the CLI I am able to access the S3 buckets successfully by passing -Dhadoop.security.credential.provider.path=jceks://hdfs/app/awss3/aws.jceks. But if I configure the same property in core-site.xml, the Hadoop services fail to start with the error below. So I configured fs.s3a.secret.key and fs.s3a.access.key instead, but I am getting the error mentioned above.
Exception in thread "main" java.lang.StackOverflowError
    at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
    at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
    at java.io.File.exists(File.java:819)
    at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1245)
    at sun.misc.URLClassPath$FileLoader.findResource(URLClassPath.java:1212)
    at sun.misc.URLClassPath$1.next(URLClassPath.java:240)
    at sun.misc.URLClassPath$1.hasMoreElements(URLClassPath.java:250)
    at java.net.URLClassLoader$3$1.run(URLClassLoader.java:601)
    at java.net.URLClassLoader$3$1.run(URLClassLoader.java:599)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader$3.next(URLClassLoader.java:598)
    at java.net.URLClassLoader$3.hasMoreElements(URLClassLoader.java:623)
    at sun.misc.CompoundEnumeration.next(CompoundEnumeration.java:45)
    at sun.misc.CompoundEnumeration.hasMoreElements(CompoundEnumeration.java:54)
    at java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:354)
    at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
    at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
    at javax.xml.parsers.FactoryFinder$1.run(FactoryFinder.java:293)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:289)
    at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:267)
    at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:120)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2549)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2526)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2418)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1232)
    at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:675)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:286)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:274)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:804)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
    at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2920)
    at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2910)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2776)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:377)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.security.alias.JavaKeyStoreProvider.initFileSystem(JavaKeyStoreProvider.java:89)
    at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.<init>(AbstractJavaKeyStoreProvider.java:82)
    at org.apache.hadoop.security.alias.JavaKeyStoreProvider.<init>(JavaKeyStoreProvider.java:49)
    at org.apache.hadoop.security.alias.JavaKeyStoreProvider.<init>(JavaKeyStoreProvider.java:41)
    at org.apache.hadoop.security.alias.JavaKeyStoreProvider$Factory.createProvider(JavaKeyStoreProvider.java:100)
    at org.apache.hadoop.security.alias.CredentialProviderFactory.getProviders(CredentialProviderFactory.java:58)
    at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:1959)
    at org.apache.hadoop.conf.Configuration.getPassword(Configuration.java:1939)
    at org.apache.hadoop.security.LdapGroupsMapping.getPassword(LdapGroupsMapping.java:621)
    at org.apache.hadoop.security.LdapGroupsMapping.setConf(LdapGroupsMapping.java:564)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:76)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:99)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:95)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:420)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:297)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:274)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:804)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
    at or
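For reference, the core-site.xml entry that triggers this looks roughly like the following (the jceks path is the one listed above). As the trace shows, reading a password from the jceks provider opens a FileSystem on HDFS, which initializes UserGroupInformation and the LDAP groups mapping, which in turn goes back to the credential provider, and so on until the stack overflows.

<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs/app/awss3/aws.jceks</value>
</property>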
Created 06-05-2017 08:11 PM
Please see updated answer and accept if helpful.
Created 06-05-2017 09:07 PM
Thanks @slachterman .. what is the best way to pass credentials using the Hadoop credential API? As per the documentation I have seen, I should be able to pass them using fs.s3a.security.credential.provider.path, but this parameter is not working.
Created 06-05-2017 09:10 PM
Please see updated answer.
Created 06-06-2017 07:28 PM
@slachterman the main reason I want to add the credential provider to the Hadoop configs is that we are planning to create Hive tables on top of the S3 data, so that authorized users (via Ranger policies) can access those tables.
I tried to pass hadoop.security.credential.provider.path as a parameter on the HiveServer2 connection, but that does not give me access to S3. I am getting the error below:
Error: Failed to open new session: org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: Cannot modify hadoop.security.credential.provider.path at runtime. It is not in list of params that are allowed to be modified at runtime (state=,code=0)
To address the above error, I added hadoop.security.credential.provider.path to hive.security.authorization.sqlstd.confwhitelist.append, but I am still getting the same error.
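For the record, this is roughly what I tried (a sketch; the regex escaping and host name are assumptions, not verified syntax):

In hive-site.xml:
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>hadoop\.security\.credential\.provider\.path</value>
</property>

On the connection:
beeline -u "jdbc:hive2://<hs2-host>:10000/default" --hiveconf hadoop.security.credential.provider.path=jceks://hdfs/app/awss3/aws.jceks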
Created 06-29-2017 07:54 PM
Ok, you've found a new problem. Congratulations. Or commiserations. Filing a bug against that (). The codepath triggering this should only be reached if fs.s3a.security.credential.provider.path is set. That should only be needed if you are hoping to provide a specific set of credentials for different buckets, customising it per bucket (fs.s3a.bucket.dev-1.security.credential.provider.path=/secrets/dev.jceks, etc.). If you have one set of secrets for all S3 buckets, set it in the main config for everything, which is what you are trying to do on the second attempt. Maybe @lmccay has some suggestion.
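As a rough sketch of the two options above (the bucket name and jceks paths are just examples):

Per-bucket credentials, for the dev-1 bucket only:
<property>
  <name>fs.s3a.bucket.dev-1.security.credential.provider.path</name>
  <value>jceks://hdfs/secrets/dev.jceks</value>
</property>

One set of secrets for all S3A buckets:
<property>
  <name>fs.s3a.security.credential.provider.path</name>
  <value>jceks://hdfs/app/awss3/aws.jceks</value>
</property>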