Created 06-30-2017 12:09 PM
I have a Spark job copying a file from an S3 bucket to HDFS using the Spark DataFrame API.
If we set --executor-cores to 1, the job completes successfully without any issues. If we set --executor-cores to 5, it throws the error below.
java.io.IOException: Cannot find AWS access key.
at org.apache.hadoop.fs.s3a.S3AFileSystem.getAWSAccessKeys(S3AFileSystem.java:358)
Created 06-30-2017 01:03 PM
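This is a known bug: multi-threaded access to CredentialProviderFactory is not thread-safe. That would also explain why the job works with --executor-cores 1, since each executor then runs only one task at a time. I had a similar case with one of our customers and had to apply the Hadoop patch HADOOP-14195.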
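For anyone unfamiliar with this class of bug, here is a rough sketch of the general pattern: several threads in one JVM share a lookup that walks mutable state, and a lock around the walk keeps them from interfering with each other. This is not the Hadoop code and not the HADOOP-14195 patch itself. The class `GuardedLookup`, its methods, and the `EXAMPLE_KEY` value are made up for illustration; only the property name `fs.s3a.access.key` is the real S3A setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; names do not come from Hadoop.
class GuardedLookup {
    // Placeholder configuration entries (assumed values).
    private static final String[][] ENTRIES = {
        {"other.setting", "x"},
        {"fs.s3a.access.key", "EXAMPLE_KEY"}
    };

    // Shared mutable cursor: unsafe if two threads walk it at once.
    private static int cursor;
    private static final Object LOCK = new Object();

    // The lock lets only one thread walk the shared cursor at a time.
    static String lookup(String name) {
        synchronized (LOCK) {
            for (cursor = 0; cursor < ENTRIES.length; cursor++) {
                if (ENTRIES[cursor][0].equals(name)) {
                    return ENTRIES[cursor][1];
                }
            }
            return null; // key not found
        }
    }

    // Runs the lookup from several threads, roughly like several
    // tasks sharing one executor JVM, and counts successful lookups.
    static int countHits(int threads, int lookupsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Integer> task = () -> {
                int hits = 0;
                for (int i = 0; i < lookupsPerThread; i++) {
                    if ("EXAMPLE_KEY".equals(lookup("fs.s3a.access.key"))) {
                        hits++;
                    }
                }
                return hits;
            };
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countHits(5, 1000));
    }
}
```

With the `synchronized` block in place, every lookup from all 5 threads finds the key. Removing the lock makes the shared cursor a race, so lookups can intermittently miss, which is the same flavour of symptom as the "Cannot find AWS access key" error above.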