12-13-2017 06:21 AM
Hi,

Just to add some information: we tried to run this code: https://github.com/satishpandey/aws-spark-samples/blob/master/src/com/spark/aws/samples/s3/SparkS3STSAssumeRole.java but ran into issues with the Hadoop properties. The "session.token" property never seems to be used once the "fs.s3n.awsSecretAccessKey" and "fs.s3n.awsAccessKeyId" properties are set. Has anyone succeeded in using AWS STS from Spark code? We don't know whether the problem is in our configuration of the properties or in the generation of the temporary credentials.

Regards,
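For anyone hitting the same behaviour: the "fs.s3n.*" keys are read by the old Jets3t-based s3n connector, which has no notion of a session token, so the token is silently ignored. Session tokens are only honoured by the s3a connector together with TemporaryAWSCredentialsProvider. A minimal sketch of the spark-submit invocation under that assumption (the jar name, bucket, and credential values are placeholders):

```shell
# Temporary STS credentials are only supported by the s3a connector, so the
# input path must use the s3a:// scheme and the fs.s3a.* property names.
# spark.hadoop.* conf entries are copied into the job's Hadoop configuration.
spark-submit \
  --class com.spark.aws.samples.s3.SparkS3STSAssumeRole \
  --conf spark.hadoop.fs.s3a.access.key=your_temp_access_key \
  --conf spark.hadoop.fs.s3a.secret.key=your_temp_secret_key \
  --conf spark.hadoop.fs.s3a.session.token=your_session_token_from_AmazonSTS \
  --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
  your-app.jar s3a://your-bucket/your-input
```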
12-08-2017 01:22 AM
Hi,

We are currently running tests on authentication to Amazon S3 with temporary credentials and are hitting the errors below. We wanted to try this snippet: https://github.com/satishpandey/aws-spark-samples/blob/master/src/com/spark/aws/samples/s3/SparkS3STSAssumeRole.java on a Cloudera 5.13 cluster with Spark 2.2. We only changed the code slightly, to use "BasicAWSCredentials" instead of "InstanceProfileCredentialsProvider".

...
Exception in thread "main" java.lang.NoSuchMethodError: com.amazonaws.SDKGlobalConfiguration.isInRegionOptimizedModeEnabled()Z
    at com.amazonaws.ClientConfigurationFactory.getConfig(ClientConfigurationFactory.java:35)
    at com.amazonaws.client.builder.AwsClientBuilder.resolveClientConfiguration(AwsClientBuilder.java:163)
    at com.amazonaws.client.builder.AwsClientBuilder.access$000(AwsClientBuilder.java:52)
    at com.amazonaws.client.builder.AwsClientBuilder$SyncBuilderParams.<init>(AwsClientBuilder.java:411)
    at com.amazonaws.client.builder.AwsClientBuilder.getSyncClientParams(AwsClientBuilder.java:354)
    at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46)
    at com.spark.aws.samples.s3.SparkS3STSAssumeRole.main(SparkS3STSAssumeRole.java:57)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
...
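A NoSuchMethodError on an AWS SDK class is the usual signature of two SDK versions on the classpath, with the cluster's older one winning. Two common workarounds are submitting with --conf spark.driver.userClassPathFirst=true (plus the executor equivalent), or relocating the SDK packages inside the application jar so they cannot clash. A minimal maven-shade-plugin sketch of the relocation approach (the shaded package name is arbitrary):

```xml
<!-- pom.xml: bundle aws-java-sdk 1.11.145 under a private package name so
     the cluster's older aws-java-sdk 1.10.6 can no longer shadow it -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.1.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.amazonaws</pattern>
            <shadedPattern>shaded.com.amazonaws</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```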
The issue comes from the jar "aws-java-sdk-1.10.6", which comes from the cluster and is picked up instead of version "1.11.145" declared in our pom.xml (which does contain the method).

We then tried to follow the Cloudera documentation (https://www.cloudera.com/documentation/enterprise/latest/topics/sg_aws_credentials.html) and set these Hadoop properties:

-Dfs.s3a.access.key=your_temp_access_key
-Dfs.s3a.secret.key=your_temp_secret_key
-Dfs.s3a.session.token=your_session_token_from_AmazonSTS
-Dfs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider

But in that case we hit this issue:

...
Exception in thread "main" java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified by setting the fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey properties (respectively).
    at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:74)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:80)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at org.apache.hadoop.fs.s3native.$Proxy36.initialize(Unknown Source)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:334)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2800)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:97)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:206)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
    at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:455)
    at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:45)
    at com.spark.aws.samples.s3.SparkS3STSAssumeRole.main(SparkS3STSAssumeRole.java:64)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
...

We generated the token and the temporary access/secret key with this Java code:

...
com.amazonaws.auth.AWSCredentials credentials =
    new com.amazonaws.auth.BasicAWSCredentials("XXX", "XXX");
com.amazonaws.auth.AWSCredentialsProvider credentialsProvider =
    new com.amazonaws.internal.StaticCredentialsProvider(credentials);
com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration endp =
    new com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration("sts.eu-west-1.amazonaws.com", "eu-west-1");
com.amazonaws.services.securitytoken.AWSSecurityTokenService sts =
    com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder.standard()
        .withCredentials(credentialsProvider)
        .withEndpointConfiguration(endp)
        .build();
com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider provider2 =
    new com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.Builder("myArnRole", "testSessionName")
        .withStsClient(sts)
        .build();
...

Any information/feedback will be much appreciated.

Thanks,
Stéfan Le Moing
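Note that the IllegalArgumentException above comes from Jets3tNativeFileSystemStore, i.e. the job resolved an s3n:// path, so the fs.s3a.* properties were never consulted. To wire the generated session credentials into the job, the three values returned by the provider have to land in the s3a (not s3n) Hadoop properties before the first s3a:// path is touched. A minimal sketch of just that property mapping — the credential strings are placeholders standing in for the values from provider2.getCredentials(), and in the real job each entry would be applied via sparkContext.hadoopConfiguration().set(key, value):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class S3aTempCredentials {

    // Builds the Hadoop properties that the s3a connector reads when
    // TemporaryAWSCredentialsProvider is configured. In the real job the
    // three credential values come from the STS session credentials.
    static Map<String, String> s3aProperties(String accessKey,
                                             String secretKey,
                                             String sessionToken) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("fs.s3a.access.key", accessKey);
        conf.put("fs.s3a.secret.key", secretKey);
        conf.put("fs.s3a.session.token", sessionToken);
        // Without this provider the session token is ignored.
        conf.put("fs.s3a.aws.credentials.provider",
                "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider");
        return conf;
    }

    public static void main(String[] args) {
        // Placeholder values standing in for the STS session credentials.
        Map<String, String> conf =
                s3aProperties("tempAccessKey", "tempSecretKey", "sessionToken");
        // Each entry would be set on sparkContext.hadoopConfiguration().
        conf.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```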
Labels:
Apache Spark