Created 08-20-2018 09:57 AM
I am trying to run a Sqoop job through Oozie to import incremental data from MySQL. The workflow does import data, but instead of importing only the incremental rows, it imports all the existing data from the table. The same Sqoop job works fine when I run it from the CLI.
The workflow ends with the following error:
2018-08-20 15:10:36,469 WARN SqoopActionExecutor:523 - SERVER[slnxhadoop03.dhcp.noid.in.sopra] USER[shobhna] GROUP[-] TOKEN[] APP[ETL Workflow] JOB[0000016-180813165952618-oozie-oozi-W] ACTION[0000016-180813165952618-oozie-oozi-W@sqoop_extract] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SqoopMain], main() threw exception, org/json/JSONObject
2018-08-20 15:10:36,473 WARN SqoopActionExecutor:523 - SERVER[slnxhadoop03.dhcp.noid.in.sopra] USER[shobhna] GROUP[-] TOKEN[] APP[ETL Workflow] JOB[0000016-180813165952618-oozie-oozi-W] ACTION[0000016-180813165952618-oozie-oozi-W@sqoop_extract] Launcher exception: org/json/JSONObject
java.lang.NoClassDefFoundError: org/json/JSONObject
    at org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:43)
    at org.apache.sqoop.SqoopOptions.writeProperties(SqoopOptions.java:759)
    at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:399)
    at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.update(HsqldbJobStorage.java:445)
    at org.apache.sqoop.tool.ImportTool.saveIncrementalState(ImportTool.java:164)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:528)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
    at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:243)
    at org.apache.sqoop.tool.JobTool.run(JobTool.java:298)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
    at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:202)
    at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:182)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:51)
    at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:242)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.ClassNotFoundException: org.json.JSONObject
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 32 more
Created 08-20-2018 11:28 AM
The Sqoop commands that work from the command line and the ones that run to completion in Oozie are going to be slightly different. The difference comes from the container environment, JAR paths, and permissions.
The only way to troubleshoot is to go into the YARN UI and click deep into the logs for the containers and the application failure(s). You are specifically looking for the logs of the failed container, NOT the workflow. Be careful: several of the higher-level logs will not always show the errors, so you will most likely end up looking in every possible place.
For my oozie / sqoop job the click through path for log inspection is:
1. From the workflow tab, click into the job.
2. Inspect the log tabs here.
3. Follow the link to the job into the YARN Resource UI.
4. Next, find the container that executed the job(s) and click on the link to its logs. This can be 1 or 2 pages to click through, so be sure to inspect all links into sub pages.
5. Inspect the log output at the deepest levels for additional information and you will usually find the error you need to solve.
Once you have the actual failure, it is usually descriptive enough to point to an adjustment in the workflow. Retry the application and repeat the steps above until resolved.
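If clicking through the UI gets tedious, the container log text can also be searched directly. Below is a minimal Python sketch, not a definitive tool, that pulls the missing class names out of a block of log text. The function name missing_classes and the sample string are my own for illustration; the sample is trimmed from the error posted above. It assumes you have already saved the log text somewhere, for example from the output of `yarn logs -applicationId <application id>`.

```python
import re

# Matches the JVM's two ways of reporting a class missing from the classpath.
_MISSING = re.compile(
    r"java\.lang\.(?:NoClassDefFoundError|ClassNotFoundException):\s*([\w.$/]+)"
)


def missing_classes(log_text):
    """Return the distinct missing class names, in order of first appearance."""
    seen = []
    for match in _MISSING.finditer(log_text):
        # NoClassDefFoundError prints org/json/JSONObject; normalize to dots.
        name = match.group(1).replace("/", ".").rstrip(".")
        if name not in seen:
            seen.append(name)
    return seen


# Sample text trimmed from the error in the question.
sample = (
    "Launcher exception: org/json/JSONObject "
    "java.lang.NoClassDefFoundError: org/json/JSONObject at "
    "org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:43) "
    "Caused by: java.lang.ClassNotFoundException: org.json.JSONObject at "
    "java.net.URLClassLoader.findClass(URLClassLoader.java:381)"
)
print(missing_classes(sample))
```

For the error in the question this reports a single missing class, org.json.JSONObject, which tells you which JAR the Oozie launcher container could not find.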
For your specific error: "Launcher exception: org/json/JSONObject java.lang.NoClassDefFoundError:"
It appears your Oozie job is missing some required JARs. When you built the job, did you choose "Use system lib path"? If not, you will need to provide additional configuration in the workflow settings. Once the workflow environment is set up the same as the command-line Sqoop environment, it should execute to SUCCEEDED.
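A quick way to confirm that setting outside the UI is to look at the job properties the workflow is submitted with. The Python sketch below assumes the "Use system lib path" checkbox corresponds to the Oozie property oozie.use.system.libpath, and that a missing entry means the system lib path is not used; check both against your Oozie version. The helper names and the example properties (host name, application path) are hypothetical.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def uses_system_libpath(text):
    """True only if oozie.use.system.libpath is explicitly set to true."""
    props = parse_properties(text)
    # Assumption: an absent property is treated as "false".
    return props.get("oozie.use.system.libpath", "false").lower() == "true"


# Hypothetical job.properties content for illustration only.
example = """
# hypothetical job.properties
nameNode=hdfs://namenode.example.com:8020
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/etl
"""
print(uses_system_libpath(example))
```

If this comes back False for your own properties, that is the first thing I would change before retrying the workflow.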
Good luck, and if this answer helps, please choose ACCEPT.