Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

java.lang.NoClassDefFoundError: com/amazonaws/services/s3/AmazonS3Client when using s3distcp

Contributor

I downloaded the s3distcp jar from s3://elasticmapreduce/libs/s3distcp/1.latest/s3distcp.jar and ran it as follows:

hadoop jar ~/s3distcp.jar  --dest hdfs://resource-manager.localhost:8020/user/ec2-user/2015/ --src s3n://test/2015/

 

But it failed with the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: com/amazonaws/services/s3/AmazonS3Client
    at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.createAmazonS3Client(S3DistCp.java:456)
    at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.createInputFileListS3(S3DistCp.java:405)
    at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.createInputFileList(S3DistCp.java:380)
    at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:640)
    at com.amazon.external.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:523)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at com.amazon.external.elasticmapreduce.s3distcp.Main.main(Main.java:13)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 

I had already copied aws-java-sdk-s3-1.9.7.jar to /opt/cloudera/parcels/CDH/jars on every node, added "/opt/cloudera/parcels/CDH/jars/*" to mapreduce.application.classpath via the Cloudera Manager web console, and restarted the cluster.
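
A note on the trace above: it passes through RunJar.main, so the class is being looked up in the client JVM started by "hadoop jar". As far as I know, mapreduce.application.classpath applies to the MapReduce task containers, so changing it may not affect that client JVM. Below is a minimal sketch for listing the jars that a classpath string would actually pick up. The function name list_classpath_jars is made up for illustration, and it only handles "dir/*" wildcard entries and explicit .jar entries.

```shell
# Sketch only: print every jar file that a Java-style classpath string resolves to.
# list_classpath_jars is a hypothetical helper, not part of Hadoop.
list_classpath_jars() {
  # Split on ':' with globbing disabled so "dir/*" entries stay literal.
  old_ifs=$IFS
  IFS=:
  set -f
  set -- $1
  IFS=$old_ifs
  set +f
  for entry in "$@"; do
    case "$entry" in
      */\*)
        # Wildcard entry: list the jars directly inside that directory.
        for jar in "${entry%\*}"*.jar; do
          if [ -f "$jar" ]; then echo "$jar"; fi
        done
        ;;
      *.jar)
        # Explicit jar entry: print it only if the file exists.
        if [ -f "$entry" ]; then echo "$entry"; fi
        ;;
    esac
  done
}
```

On a node where the hadoop command is installed, something like list_classpath_jars "$(hadoop classpath)" | grep aws-java-sdk should show whether any SDK jar is visible to the client. If it prints nothing, that would be consistent with the NoClassDefFoundError above.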

1 ACCEPTED SOLUTION

Contributor

I have resolved this. Download the AWS SDK for Java from http://sdk-for-java.amazonwebservices.com/latest/aws-java-sdk.zip, unzip it, and copy every jar in aws-java-sdk/lib/ and aws-java-sdk/third-party/ to /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/lib/hadoop on your datanodes.
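
The copy step could be scripted roughly like this. It is a sketch that assumes the zip has already been downloaded and unpacked; copy_sdk_jars is a made-up helper name, and find is used so that jars are picked up whether or not third-party/ nests them in subfolders (the exact archive layout is an assumption here).

```shell
# Sketch only: copy every jar under <sdk_dir>/lib and <sdk_dir>/third-party
# into a Hadoop lib directory. copy_sdk_jars is a hypothetical helper.
copy_sdk_jars() {
  sdk_dir=$1      # directory the SDK zip was unpacked into
  hadoop_lib=$2   # target directory, e.g. the parcel's lib/hadoop
  for sub in lib third-party; do
    if [ -d "$sdk_dir/$sub" ]; then
      # Search recursively in case the jars sit in nested folders.
      find "$sdk_dir/$sub" -type f -name '*.jar' -exec cp {} "$hadoop_lib/" \;
    fi
  done
}
```

For example, copy_sdk_jars ~/aws-java-sdk /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/lib/hadoop, run on each node with enough permissions to write to the parcel directory. The unpacked directory name may differ from ~/aws-java-sdk depending on the SDK version.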


3 REPLIES

Contributor

I have used this method (adding the jar directory to mapreduce.application.classpath) to resolve NoClassDefFoundError problems before and it worked every time, but this time it didn't 😞


Contributor

In addition, there is one piece of dirty work you need to do, since CDH 5.2 doesn't include the class

org/apache/hadoop/fs/s3native/ProgressableResettableBufferedFileInputStream

 

You need to extract this class from Amazon EMR and repackage it into a jar of your own. Alternatively, you can compile it from the source code here:

 

https://github.com/libin/s3distcp/blob/master/src/main/java/com/amazon/external/elasticmapreduce/s3d...