Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

oozie sqoop action hangs at 95%

avatar
New Member

I have a sqoop import that works fine via the command line

~$ sqoop import --connect "jdbc:sqlserver://10.100.197.46:1433;database=rtoISONE" --username hadoop --password XXXXXX --hive-import --hive-database pe rl3 --hive-overwrite -m 1 --table MaxIndex

but when when I try to run it with a oozie workflow it never leaves the RUNNING phase and when I look at it in yarn it sits at 95%, I know that my oozie is set up correctly for one thing because when I run a shell script under it, it completes with out problem.

workflow.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<workflow-app xmlns="uri:oozie:workflow:0.5" name="sqoop-wf">  
  <global/>  
  <start to="sqoop"/> 
  <action name="sqoop">  
    <sqoop xmlns="uri:oozie:sqoop-action:0.3">  
      <job-tracker>${resourceManager}</job-tracker>  
      <name-node>${nameNode}</name-node>
      <command>${command}</command>  
    </sqoop>  
    <ok to="end"/> 
    <error to="kill"/>
  </action>  
  <kill name="kill">  
    <message>${wf:errorMessage(wf:lastErrorNode())}</message>  
  </kill>  <end name="end"/>
</workflow-app>

job.properties

nameNode=hdfs://hadoopctrl:8020
resourceManager=hadoopctrl:8050
queueName=default
oozie.use.system.libpath=true
oozie.action.sharelib.for.sqoop=sqoop,hive,hcatalog
oozie.wf.application.path=${nameNode}/user/${user.name}
command=import --connect "jdbc:sqlserver://10.100.197.46:1433;database=rtoISONE" --username hadoop --password XXXXXX --hive-import --hive-database perl3 --hive-overwrite -m 1 --table MaxIndex

I have my vcores set to 10

I have tried adding different property to my workflow

<property> 
  <name>mapred.reduce.tasks</name>  
  <value>-1</value>  
</property>  
<property>  
  <name>mapreduce.job.reduces</name>  
  <value>1</value>  
</property>  
<property>  
  <name>mapreduce.job.queuname</name>  
  <value>launcher2</value>  
</property>  
<property>  
  <name>mapred.compress.map.output</name>  
  <value>true</value>  
</property>

Any ides any one has would be much appreciated

Thanks

	
1 ACCEPTED SOLUTION

avatar
New Member

Ok we have resolved our issues, it was a combination of three things; @antin leszczyszyn and @Artem Ervits put me on the right road, I will document how we solved the issues in the hopes that it helps someone else.

1. As Antin pointed out we had a user issue our group had installed apache ranger which changed the hadoop users and

permissions.

2. As Artem pointed out in the link to his tutorial we needed to create a lib folder in the folder that we are running our workflow from and add the jdbc.jar file and add the hive-site.xml and tez-site.xml .

3. When trying to trouble shoot this problem we had changed the scheduler to the fair version, we changed it back to

capacity scheduler and changed maximum-am-resource-percent=0.2 to 0.6

Thanks for the help

View solution in original post

11 REPLIES 11

avatar
New Member

Ok we have resolved our issues, it was a combination of three things; @antin leszczyszyn and @Artem Ervits put me on the right road, I will document how we solved the issues in the hopes that it helps someone else.

1. As Antin pointed out we had a user issue our group had installed apache ranger which changed the hadoop users and

permissions.

2. As Artem pointed out in the link to his tutorial we needed to create a lib folder in the folder that we are running our workflow from and add the jdbc.jar file and add the hive-site.xml and tez-site.xml .

3. When trying to trouble shoot this problem we had changed the scheduler to the fair version, we changed it back to

capacity scheduler and changed maximum-am-resource-percent=0.2 to 0.6

Thanks for the help

avatar

You are welcome, glad you've got it sorted.