java.lang.NoClassDefFoundError: com/amazonaws/services/s3/AmazonS3Client when using s3distcp

I downloaded the s3distcp jar from s3://elasticmapreduce/libs/s3distcp/1.latest/s3distcp.jar and ran it as follows:

hadoop jar ~/s3distcp.jar  --dest hdfs://resource-manager.localhost:8020/user/ec2-user/2015/ --src s3n://test/2015/

 

But it failed with the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: com/amazonaws/services/s3/AmazonS3Client
at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.createAmazonS3Client(S3DistCp.java:456)
at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.createInputFileListS3(S3DistCp.java:405)
at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.createInputFileList(S3DistCp.java:380)
at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:640)
at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:523)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at com.amazon.external.elasticmapreduce.s3distcp.Main.main(Main.java:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 

I had already copied aws-java-sdk-s3-1.9.7.jar to /opt/cloudera/parcels/CDH/jars on every node, added "/opt/cloudera/parcels/CDH/jars/*" to mapreduce.application.classpath via the Cloudera Manager web console, and restarted the cluster.
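
A quick way to double-check a setup like this is to search the jar directories for the missing class. The stack trace shows the error in thread "main" under RunJar, that is, in the client JVM started by hadoop jar, so it may be worth checking the directories that JVM loads from and not only the ones listed in mapreduce.application.classpath. The sketch below is not part of s3distcp or Hadoop: find_class_jar is a made-up helper name, and it relies on zip archives storing entry names as plain bytes, so grep can spot them without unpacking.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every jar directly inside the given directories
# whose archive mentions the given class entry name.
find_class_jar() {
  local entry="$1" dir
  shift
  for dir in "$@"; do
    # Ignore directories that do not exist on this node.
    [ -d "$dir" ] || continue
    # grep -l prints only the names of matching files; -F treats the entry
    # name as a fixed string. grep exits non-zero when a batch has no match,
    # which is not an error here, hence the "|| true".
    find "$dir" -maxdepth 1 -name '*.jar' -exec grep -l -F "$entry" {} + || true
  done
  return 0
}
```

For example, find_class_jar 'com/amazonaws/services/s3/AmazonS3Client.class' /opt/cloudera/parcels/CDH/jars should print the SDK jar if the copy succeeded. No output means no jar directly inside that directory contains the class.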

1 ACCEPTED SOLUTION

I have resolved this. Download the AWS SDK for Java from http://sdk-for-java.amazonwebservices.com/latest/aws-java-sdk.zip, unzip it, and copy every jar in aws-java-sdk/lib/ and aws-java-sdk/third-party/ to /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/lib/hadoop on your datanodes.
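
The copy step above could be scripted roughly as follows. This is only a sketch: copy_sdk_jars is a hypothetical helper, the unpacked SDK folder may carry a version suffix on your machine, and find is used so that jars nested in subfolders of third-party/ are picked up as well.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy every jar under <sdk_dir>/lib and
# <sdk_dir>/third-party into <dest_dir>.
copy_sdk_jars() {
  local sdk_dir="$1" dest_dir="$2" sub
  for sub in lib third-party; do
    # Skip a subfolder that does not exist in this SDK layout.
    [ -d "$sdk_dir/$sub" ] || continue
    # -type f keeps directories out; nested jars are copied flat into dest_dir.
    find "$sdk_dir/$sub" -type f -name '*.jar' -exec cp {} "$dest_dir/" \;
  done
}
```

After downloading and unzipping the archive on a datanode, a call might look like copy_sdk_jars ~/aws-java-sdk /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/lib/hadoop, where the first path is an assumption about where the archive was unpacked. Repeat this on each datanode.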

REPLIES

I used this method to resolve NoClassDefFoundError problems before and it worked every time, but this time it didn't 😞

Besides, there is one more piece of dirty work to do, because CDH 5.2 doesn't include the class

org/apache/hadoop/fs/s3native/ProgressableResettableBufferedFileInputStream

 

You need to extract this class from Amazon EMR and repackage it, or compile it yourself from the source code here:

 

https://github.com/libin/s3distcp/blob/master/src/main/java/com/amazon/external/elasticmapreduce/s3d...