Add third-party lib to MapReduce job Error: java.lang.ClassNotFoundException: org.json.JSONObject

Contributor

Command to submit the job:

hadoop jar driver-collection-1.0-SNAPSHOT.jar multipleoutputjson /test/inputjson/json_input.txt /test/outputjson 

Error: java.lang.ClassNotFoundException: org.json.JSONObject (see the attached error-message.txt)

I have the following dependency for org.json.JSONObject (file name: json-20160212.jar) in pom.xml:

<dependency>
  <groupId>org.json</groupId>
  <artifactId>json</artifactId>
  <version>20160212</version>
  <scope>compile</scope>
</dependency>

Code: The code is pretty much like this.

I have tried the following links, but in vain. Please share your insight.
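A ClassNotFoundException for org.json.JSONObject means the class was not visible to the JVM that tried to load it. A compile-scope Maven dependency is available at build time, but the default jar packaging does not copy it into the job jar. As a minimal sketch, assuming you can run a small class with the same classpath as the failing JVM, the probe below reports whether a class is loadable and where it came from. ClassProbe and locate are hypothetical names used only for illustration; they are not part of Hadoop or org.json.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of Hadoop or org.json).
public final class ClassProbe {
    private ClassProbe() {}

    /** Returns the location a class was loaded from, or null if it cannot be loaded. */
    static String locate(String className) {
        try {
            // 'false' means: find the class but do not run its static initializers.
            Class<?> c = Class.forName(className, false, ClassProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK core classes usually have no code source.
            return src == null ? "(JDK or unknown source)" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Default to the class from the error message in the question.
        String name = args.length > 0 ? args[0] : "org.json.JSONObject";
        String where = locate(name);
        System.out.println(where == null
                ? name + " is NOT on the classpath"
                : name + " loaded from " + where);
    }
}
```

Running it once with and once without json-20160212.jar on the classpath shows whether the jar is being picked up.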

1 ACCEPTED SOLUTION

Contributor

I used the method - "Distributed cache" from "Hadoop: Add third-party libraries to MapReduce job" and got it working.

These are my steps:

  • Copied json-20160212.jar to /user/root/lib/json-20160212.jar in HDFS.
  • Added the following code to the Driver class:
    • job.addCacheFile(new Path("/user/root/lib/json-20160212.jar").toUri());
    • job.setJarByClass(JSONObject.class);
  • Compiled the code and ran the test.

I am still eager to learn how to solve it with the following methods:

  • Add the -libjars option
  • Add jar files to Hadoop classpath
  • Create a fat jar

Thank you.


4 REPLIES

Mentor
@Charles Chen

I don't see an issue with your steps. What I remember from past experience with org.json is that some classes were refactored out, so perhaps org.json.JSONObject is not available in version 20160212; try an earlier version such as 20140107. This will create a fat jar; <scope> is optional, since compile is the default. That or -libjars are viable options. Let us know if you're still having issues and I'll try to help.

Contributor

I have tried 20160212 and 20090211, and only got it working with the distributed cache. I got the jar from the link below:

https://mvnrepository.com/artifact/org.json/json

Below is my command. How do I use -libjars?

hadoop jar driver-collection-1.0-SNAPSHOT.jar multipleoutputjson /test/inputjson/json_input.txt /test/outputjs

Thank you for your reply.
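On the -libjars question: -libjars is one of Hadoop's generic options. It is only honored when the driver parses its arguments through GenericOptionsParser (for example by running through ToolRunner), it takes a comma-separated list of jars, and it has to appear before the application's own arguments. The sketch below only assembles such a command line so the ordering is explicit. It assumes that the program name (multipleoutputjson) is dispatched by the job jar and that the remaining arguments reach a ToolRunner-based driver; the helper LibJarsCommand and the local jar path are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the argument list for a 'hadoop jar' call with -libjars.
public final class LibJarsCommand {
    private LibJarsCommand() {}

    static List<String> build(String jobJar, String program,
                              List<String> libJars, List<String> appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("hadoop");
        cmd.add("jar");
        cmd.add(jobJar);
        cmd.add(program);
        if (!libJars.isEmpty()) {
            // Generic options go before the application's own arguments.
            cmd.add("-libjars");
            // Multiple jars are joined with commas, without spaces.
            cmd.add(String.join(",", libJars));
        }
        cmd.addAll(appArgs);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build(
                "driver-collection-1.0-SNAPSHOT.jar",
                "multipleoutputjson",
                // Assumption: the jar sits in the current local directory.
                Arrays.asList("json-20160212.jar"),
                Arrays.asList("/test/inputjson/json_input.txt", "/test/outputjson"));
        System.out.println(String.join(" ", cmd));
    }
}
```

If the driver itself (not only the mapper or reducer) uses org.json classes, the client JVM needs the jar as well, typically through the HADOOP_CLASSPATH environment variable.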
