First, let's create a sample file in S3:

In the AWS Console, go to S3, create a bucket named “S3Demo”, and pick your region. Upload the file manually using the Upload button (the example file name used later in the Scala code is S3HDPTEST.csv).

In the HDP 2.4.0 Sandbox:

Download the AWS SDK for Java and upload it to the Hadoop directory. You should see aws-java-sdk-1.10.65.jar in /usr/hdp/

-rw-r--r-- 1 root root  32380018 2016-03-31 22:02 aws-java-sdk-1.10.65.jar

Change directory to spark/bin:

[root@sandbox bin]# cd /usr/hdp/

Start the Spark Scala shell with the required AWS jar dependencies:

 ./spark-shell  --master yarn-client --jars /usr/hdp/,/usr/hdp/,/usr/hdp/ --driver-memory 512m --executor-memory 512m

Now for some Scala code to configure the AWS secret keys in hadoopConf:

val hadoopConf = sc.hadoopConfiguration

// The credential keys are looked up by URI scheme, so they use "s3n" to match the s3n:// URL read below.
hadoopConf.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
hadoopConf.set("fs.s3n.awsAccessKeyId", "xxxxxxx")
hadoopConf.set("fs.s3n.awsSecretAccessKey", "xxxxxxx")

Now read the file from the S3 bucket:

val myLines = sc.textFile("s3n://s3hdptest/S3HDPTEST.csv")
// Print the number of lines in the file.
println(myLines.count())
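
As far as I know, the native S3 filesystem looks up the credentials under property names built from the URI scheme, so the keys set in hadoopConf have to match the scheme used in sc.textFile. The sketch below only illustrates that naming pattern. S3CredentialKeys is a hypothetical helper, not part of Hadoop or Spark, and it does not touch S3 at all.

```scala
import java.net.URI

// Hypothetical helper (not a Hadoop or Spark API): derives the credential
// property names that correspond to the scheme of an S3 URI.
object S3CredentialKeys {
  def forUri(uri: String): (String, String) = {
    // For "s3n://bucket/key" the scheme is "s3n".
    val scheme = new URI(uri).getScheme
    (s"fs.$scheme.awsAccessKeyId", s"fs.$scheme.awsSecretAccessKey")
  }
}

val (accessKeyProp, secretKeyProp) =
  S3CredentialKeys.forUri("s3n://s3hdptest/S3HDPTEST.csv")
println(accessKeyProp) // fs.s3n.awsAccessKeyId
println(secretKeyProp) // fs.s3n.awsSecretAccessKey
```

In the spark-shell, the two returned names could be passed to hadoopConf.set together with the real key values.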

I followed exactly the same instructions as in this tutorial, and when I start the Spark Scala shell:

./spark-shell --master yarn-client --jars /usr/hdp/,/usr/hdp/,/usr/hdp/ --driver-memory 512m --executor-memory 512m

I get this exception:

ERROR SparkContext: Error initializing SparkContext.
java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3a.S3AFileSystem could not be instantiated

Caused by: java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(
at java.lang.Class.getConstructor0(
at java.lang.Class.newInstance(
at java.util.ServiceLoader$LazyIterator.nextService(

Note that all the jars included in the spark-shell command exist.

Any idea about this error?

Thank you for your help.

Hi, you need one more JAR file, guava-19.0.jar. Download it and add it to the --jars path.

Hi, this looks like a simple error: I see s3a in your exception, but I think s3 or s3n should be there.
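
One way to narrow this down might be to check, from inside the shell, whether the class named in the NoClassDefFoundError can actually be loaded on the driver. The sketch below is only a diagnostic. ClasspathCheck is a hypothetical helper name, and a successful result on the driver says nothing about the executors' classpath.

```scala
// Hypothetical diagnostic helper (not a Spark API): reports whether a class
// can be found by the class loader that loaded this object.
object ClasspathCheck {
  def isLoadable(className: String): Boolean =
    try {
      // initialize = false: only locate the class, do not run static initializers.
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException => false
      case _: NoClassDefFoundError   => false
    }
}

// Class name taken from the exception in the question above.
println(ClasspathCheck.isLoadable("com.amazonaws.auth.AWSCredentialsProvider"))
```

If this prints false, the AWS SDK jar passed with --jars is probably not reaching the driver classpath, which would be consistent with the exception.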