Add third-party lib to MapReduce job Error: java.lang.ClassNotFoundException: org.json.JSONObject
Labels: Apache Hadoop
Created 07-07-2016 10:51 PM
Command to submit the job:
hadoop jar driver-collection-1.0-SNAPSHOT.jar multipleoutputjson /test/inputjson/json_input.txt /test/outputjson
Error: java.lang.ClassNotFoundException: org.json.JSONObject (see the attached error-message.txt).
I have the following dependency for org.json.JSONObject (filename: json-20160212.jar) in pom.xml:
<dependency>
  <groupId>org.json</groupId>
  <artifactId>json</artifactId>
  <version>20160212</version>
  <scope>compile</scope>
</dependency>
Code: The code is pretty much like this.
I have tried the following links, but in vain. Please share your insight.
Created 07-08-2016 02:32 AM
I used the "Distributed cache" method from "Hadoop: Add third-party libraries to MapReduce job" and got it working.
These are my steps:
- Copied json-20160212.jar to /user/root/lib/json-20160212.jar in HDFS.
- Added the following code to the Driver class:
- job.addCacheFile(new Path("/user/root/lib/json-20160212.jar").toUri());
- job.setJarByClass(JSONObject.class);
- Compiled the code and ran the test.
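For anyone following along, the copy step could look roughly like the sketch below. It is only an illustration: LOCAL_JAR (the jar sitting in the current directory), the DRY_RUN switch and the run helper are assumptions added here, not part of the original steps. By default it only prints the commands; set DRY_RUN=0 to execute them on a machine with an HDFS client.

```shell
#!/bin/sh
# Sketch: stage the JSON jar in HDFS so the job can add it to the distributed cache.
# LOCAL_JAR is an assumed local path; HDFS_LIB_DIR is the HDFS directory from the steps above.
LOCAL_JAR="${LOCAL_JAR:-json-20160212.jar}"
HDFS_LIB_DIR="/user/root/lib"

# Hypothetical helper: print the command when DRY_RUN=1 (the default here), run it otherwise.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

stage_jar() {
  run hdfs dfs -mkdir -p "$HDFS_LIB_DIR"              # create the target directory if missing
  run hdfs dfs -put -f "$LOCAL_JAR" "$HDFS_LIB_DIR/"  # upload, overwriting an older copy
  run hdfs dfs -ls "$HDFS_LIB_DIR"                    # confirm the jar is in place
}

stage_jar
```

The path passed to job.addCacheFile(...) in the Driver then has to match the uploaded location, /user/root/lib/json-20160212.jar.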
I am still eager to learn how to solve it using the following methods:
- Add libjars option
- Add jar files to Hadoop classpath
- Create a fat jar
Thank you.
Created 07-08-2016 03:16 AM
I don't see an issue with your steps. From past experience with org.json, I recall that some classes were refactored out, so org.json.JSONObject may not be available in version 20160212; try an earlier version such as 20140107. This will create a fat jar; <scope> is optional, since compile is the default. That or -libjars are both viable options. Let us know if you're still having issues and I'll try to help.
Created 07-09-2016 02:28 AM
I have tried 20160212 and 20090211, and only got it working with the distributed cache. I got the jar from the link below:
https://mvnrepository.com/artifact/org.json/json
Below is my command. How do I use -libjars?
hadoop jar driver-collection-1.0-SNAPSHOT.jar multipleoutputjson /test/inputjson/json_input.txt /test/outputjs
Thank you for your reply.
Created 07-12-2016 09:04 AM
Please see this example: https://dzone.com/articles/using-libjars-option-hadoop
And also: http://stackoverflow.com/questions/6890087/problem-with-libjars-in-hadoop
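As a rough sketch of how -libjars might be applied to the command from this thread: the snippet below assumes that multipleoutputjson is a program name dispatched by the jar's main class, and that the selected driver parses generic options through ToolRunner or GenericOptionsParser. If it does not, -libjars is not picked up and simply arrives as an ordinary argument. JSON_JAR (a local path), the DRY_RUN switch and the run helper are hypothetical names introduced for illustration. By default the script only prints the command; set DRY_RUN=0 to submit the job.

```shell
#!/bin/sh
# Sketch: pass a third-party jar to a MapReduce job with -libjars.
# JSON_JAR is a hypothetical local path; adjust it to where the jar actually lives.
JSON_JAR="${JSON_JAR:-/tmp/lib/json-20160212.jar}"

# Hypothetical helper: print the command when DRY_RUN=1 (the default here), run it otherwise.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

submit_with_libjars() {
  # Client side: make the jar visible to the JVM that runs the driver.
  export HADOOP_CLASSPATH="$JSON_JAR${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}"
  # Task side: -libjars ships the jar to the mappers and reducers.
  # Generic options must come before the job's own arguments.
  run hadoop jar driver-collection-1.0-SNAPSHOT.jar multipleoutputjson \
    -libjars "$JSON_JAR" \
    /test/inputjson/json_input.txt /test/outputjson
}

submit_with_libjars
```

HADOOP_CLASSPATH only affects the client-side JVM started by the hadoop script, while -libjars is what makes the jar available to the tasks, so a driver that itself references org.json classes may need both.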
