Created 08-25-2017 07:54 PM
I have a sqoop import that works fine via the command line
~$ sqoop import --connect "jdbc:sqlserver://10.100.197.46:1433;database=rtoISONE" --username hadoop --password XXXXXX --hive-import --hive-database perl3 --hive-overwrite -m 1 --table MaxIndex
but when I try to run it with an Oozie workflow it never leaves the RUNNING phase, and in YARN it sits at 95%. I know my Oozie setup is basically correct, because a shell-script action run through it completes without problems.
workflow.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<workflow-app xmlns="uri:oozie:workflow:0.5" name="sqoop-wf">
    <global/>
    <start to="sqoop"/>
    <action name="sqoop">
        <sqoop xmlns="uri:oozie:sqoop-action:0.3">
            <job-tracker>${resourceManager}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>${command}</command>
        </sqoop>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
job.properties
nameNode=hdfs://hadoopctrl:8020
resourceManager=hadoopctrl:8050
queueName=default
oozie.use.system.libpath=true
oozie.action.sharelib.for.sqoop=sqoop,hive,hcatalog
oozie.wf.application.path=${nameNode}/user/${user.name}
command=import --connect "jdbc:sqlserver://10.100.197.46:1433;database=rtoISONE" --username hadoop --password XXXXXX --hive-import --hive-database perl3 --hive-overwrite -m 1 --table MaxIndex
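For reference, I submit the workflow with the Oozie CLI roughly like this (the host is my Oozie server and 11000 is just the default Oozie port in my setup):

~$ oozie job -oozie http://hadoopctrl:11000/oozie -config job.properties -run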
I have my vcores set to 10
I have tried adding different properties to my workflow:
<property>
    <name>mapred.reduce.tasks</name>
    <value>-1</value>
</property>
<property>
    <name>mapreduce.job.reduces</name>
    <value>1</value>
</property>
<property>
    <name>mapreduce.job.queuename</name>
    <value>launcher2</value>
</property>
<property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
</property>
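In case the placement matters, this is roughly how that block sits inside the sqoop action in my workflow.xml (only a sketch; the launcher2 queue and property names are the ones from my attempts above):

<action name="sqoop">
    <sqoop xmlns="uri:oozie:sqoop-action:0.3">
        <job-tracker>${resourceManager}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapreduce.job.queuename</name>
                <value>launcher2</value>
            </property>
        </configuration>
        <command>${command}</command>
    </sqoop>
    <ok to="end"/>
    <error to="kill"/>
</action>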
Any ideas anyone has would be much appreciated.
Thanks
Created 08-31-2017 12:25 PM
OK, we have resolved our issues; it was a combination of three things. @antin leszczyszyn and @Artem Ervits put me on the right road. I will document how we solved them in the hope that it helps someone else.
1. As Antin pointed out, we had a user issue: our group had installed Apache Ranger, which changed the Hadoop users and permissions.
2. As Artem pointed out in the link to his tutorial, we needed to create a lib folder inside the folder we run our workflow from and add the JDBC .jar file, plus hive-site.xml and tez-site.xml (commands sketched after this list).
3. While troubleshooting this problem we had switched to the fair scheduler; we changed back to the capacity scheduler and raised maximum-am-resource-percent from 0.2 to 0.6 (property shown after this list).
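For anyone following along, step 2 looked roughly like this on our cluster (the jar name and local config paths are from my setup; your SQL Server JDBC driver jar and workflow application path will differ):

~$ hdfs dfs -mkdir -p /user/hadoop/lib
~$ hdfs dfs -put sqljdbc4.jar /user/hadoop/lib/
~$ hdfs dfs -put /etc/hive/conf/hive-site.xml /user/hadoop/lib/
~$ hdfs dfs -put /etc/tez/conf/tez-site.xml /user/hadoop/lib/

For step 3, the underlying setting lives in capacity-scheduler.xml; 0.6 is the value we ended up with, and the queues need a refresh (or a ResourceManager restart) after the change:

<property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.6</value>
</property>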
Thanks for the help
Created 08-31-2017 08:17 PM
You are welcome, glad you've got it sorted.