Created 01-05-2016 03:38 AM
I'm trying to test Oozie's Sqoop action in the following environment:
Via the command line, the following sqoop command works:
```
sqoop import \
  -D mapred.task.timeout=0 \
  --connect jdbc:sqlserver://x.x.x.x:1433;database=CEMHistorical \
  --table MsgCallArrival \
  --username hadoop \
  --password-file hdfs:///user/sqoop/.adg.password \
  --hive-import \
  --create-hive-table \
  --hive-table develop.oozie \
  --split-by TimeStamp \
  --map-column-hive Call_ID=STRING,Stream_ID=STRING
```
But when I try to execute the same command via Oozie, I run into `java.io.IOException: No columns to generate for ClassWriter`.
Below are my `job.properties` and `workflow.xml`:
`job.properties`:

```
nameNode=hdfs://host.vitro.com:8020
jobTracker=host.vitro.com:8050
projectRoot=${nameNode}/user/${user.name}/tmp/sqoop-test/
oozie.use.system.libpath=true
oozie.wf.application.path=${projectRoot}
```

`workflow.xml`:

```
<workflow-app name="sqoop-test-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="sqoop-import"/>
    <action name="sqoop-import" retry-max="10" retry-interval="1">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import -D mapred.task.timeout=0 --connect jdbc:sqlserver://x.x.x.x:1433;database=CEMHistorical --table MsgCallArrival --username hadoop --password-file hdfs:///user/sqoop/.adg.password --hive-import --create-hive-table --hive-table develop.oozie --split-by TimeStamp --map-column-hive Call_ID=STRING,Stream_ID=STRING</command>
        </sqoop>
        <ok to="end"/>
        <error to="errorcleanup"/>
    </action>
    <kill name="errorcleanup">
        <message>Sqoop Test WF failed. [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```
I've attached the full log, but here's an excerpt:
```
2016-01-05 11:29:21,415 ERROR [main] tool.ImportTool (ImportTool.java:run(613)) - Encountered IOException running import job: java.io.IOException: No columns to generate for ClassWriter
	at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1651)
	at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
	at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
	at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:177)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
	at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:46)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
```
I've been struggling with this problem for quite some time now, any help would be greatly appreciated!
Created 02-03-2016 02:03 AM
At the same time I was getting this issue, I was also dealing with a network problem when issuing Sqoop commands via the CLI. The network problem was eventually resolved and the IOException went away, but I then ran into new errors that I never managed to resolve.
In the end, I decided to work around it by breaking the hive import into a 2-step workflow:
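The workflow itself isn't included in the post; a minimal sketch of such a 2-step workflow, assuming the Sqoop action first lands the data in an HDFS staging directory (no `--hive-import`) and a Hive action then loads it into the table, might look like the following. The action names, staging path, and `load.q` script are assumptions, not the poster's actual files:

```
<workflow-app name="sqoop-then-hive-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="sqoop-import"/>
    <!-- Step 1: plain HDFS import, no --hive-import -->
    <action name="sqoop-import">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect jdbc:sqlserver://x.x.x.x:1433;databaseName=CEMHistorical --table MsgCallArrival --username hadoop --password-file hdfs:///user/sqoop/.adg.password --target-dir /tmp/staging/MsgCallArrival --split-by TimeStamp</command>
        </sqoop>
        <ok to="hive-load"/>
        <error to="fail"/>
    </action>
    <!-- Step 2: load the staged files into the Hive table,
         e.g. load.q containing:
         LOAD DATA INPATH '/tmp/staging/MsgCallArrival' INTO TABLE develop.oozie; -->
    <action name="hive-load">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>load.q</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```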
UPDATE:
It turns out that the "new errors" were because the "yarn" user didn't belong to the "hdfs" group and so couldn't perform the hive-import step. Adding that user to the group now lets me use hive-import in my workflows instead of the 2-step workflow I used before.
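For reference, on a typical Linux cluster node the group fix described above could be applied as follows. This assumes local OS users and a group literally named `hdfs`; if the cluster uses LDAP/SSSD or a different supergroup name, the equivalent change happens there instead:

```
# Add the yarn user to the hdfs group (run as root on the relevant nodes).
usermod -a -G hdfs yarn

# Verify the membership took effect:
id -Gn yarn
```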
Created 01-05-2016 03:45 AM
The error "No columns to generate for ClassWriter" is thrown when the command retrieves no results.
So something is wrong with the command, and it looks like the culprit is the JDBC URL.
Per MSDN, the correct syntax for naming a database uses the key "databaseName", not "database" (as in your snippet above). Try changing that.
Connect to a named database on a remote server:

```
jdbc:sqlserver://localhost;databaseName=AdventureWorks;integratedSecurity=true;
```
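Applied to the command in the question, the change is only in the connect string. Note also that when the command runs in a shell (rather than inside Oozie's `<command>` element), the URL should be quoted so the shell does not treat the semicolon as a command separator:

```
# before: 'database' key, no results retrieved
--connect "jdbc:sqlserver://x.x.x.x:1433;database=CEMHistorical"

# after: 'databaseName' key, per the MSDN syntax
--connect "jdbc:sqlserver://x.x.x.x:1433;databaseName=CEMHistorical"
```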
Created 02-03-2016 01:47 AM
@Luis Antonio Torres has this been resolved? Can you provide your solution or accept the best answer?
Created 02-03-2016 01:56 AM
@Artem Ervits I never got sqoop's hive-import to work in an Oozie workflow, so I came up with a workaround instead. Will provide my workaround as an answer. Thanks.
Created 02-03-2016 03:42 PM
As far as I can see in the Sqoop action above, there is no hive-site.xml file. I'm guessing you added it to the lib directory in the deployment directory, which will keep the Hive action from running and gives you something like a hive-site.xml permission error. You should instead add the hive-site.xml file under "Files" in the Sqoop action.
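For reference, attaching hive-site.xml to the action itself (rather than dropping it in lib/) would look something like the sketch below. The path is an assumption; the `#hive-site.xml` fragment just names the symlink created in the action's working directory:

```
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <command>import ...</command>
    <!-- Ship hive-site.xml into the action's working directory -->
    <file>${projectRoot}/hive-site.xml#hive-site.xml</file>
</sqoop>
```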
Created 02-04-2016 12:33 AM
@Shigeru Takehara I did try adding hive-site.xml, placing it in the root of the workflow directory on HDFS, but I was still hitting the error; the message in the logs is:

```
ERROR [main] tool.ImportTool (ImportTool.java:run(613)) - Encountered IOException running import job: java.io.IOException: Hive exited with status 1
```
I eventually had to go with my workaround because I couldn't get the Hive import to work and I had deadlines to meet. I'd still like to get it working, though.
Created 02-04-2016 03:04 AM
Where did you copy your JDBC driver for the Sqoop action?
Created 02-09-2016 06:17 AM
Hi @Shigeru Takehara, the JDBC driver is in the Oozie sharelib. Also, the import itself was working fine - I can see the data in HDFS. It's only the load into Hive that fails, which is why I opted to break the import into two separate steps.