
Oozie Workflow got hung while executing Hive "insert into" table command

New Contributor

Hi

I'm using CDH 5.16.0 and I'm trying to execute a simple Oozie workflow that inserts data from one Hive table into another. The Hive script runs fine from the CLI (hive -f script.hql), but the same HQL script hangs when executed through the Oozie workflow, specifically on the "insert into table prod1 select * from prod" statement. The Oozie job stays in the RUNNING state in the Workflow Manager console with no progress, and I eventually have to kill it. The hang occurs only for the "insert into" Hive command.

 

job.properties file is given below:

 

nameNode=hdfs://quickstart.cloudera:8020

jobTracker=quickstart.cloudera:8032

queueName=default

oozie.use.system.libpath=true

oozie.libpath=/user/oozie/share/lib

oozie.wf.application.path=${nameNode}/user/hadoop/poc

dbName=test

inputPath=hdfs:///user/hadoop/input.txt

 

script.hql file is given below:

use ${DB_NAME};

create table prod (productid int, productname string, price float, category string) row format delimited fields terminated by ',';

create table prod1 (productid int, productname string, price float, category string) row format delimited fields terminated by ',';

load data inpath '${INPUT_PATH}' into table prod; 

insert into table prod1 select * from prod;

 

input.txt file is given below:

1,hive,25,sql

2,mongodb,30,nosql

 

Workflow.xml file is given below:

<workflow-app xmlns="uri:oozie:workflow:0.4" name="hive-wf">

<start to="hive-node"/>

<action name="hive-node">

<hive xmlns="uri:oozie:hive-action:0.2">

<job-tracker>${jobTracker}</job-tracker>

<name-node>${nameNode}</name-node>

<job-xml>hive-site.xml</job-xml>

<configuration>

<property>

<name>mapred.job.queue.name</name>

<value>${queueName}</value>

</property>

</configuration>

<script>script.hql</script>

<param>DB_NAME=${dbName}</param>

<param>INPUT_PATH=${inputPath}</param>

</hive>

<ok to="end"/>

<error to="fail"/>

</action>

<kill name="fail">

<message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>

</kill>

<end name="end"/>

</workflow-app>

1 Reply

Rising Star

Hi,

 

I noticed from the nameNode value in your job.properties (hdfs://quickstart.cloudera:8020) that you are using a Quickstart VM. Most likely the Oozie launcher starts and stays in RUNNING status, but YARN does not have enough resources to launch the additional Hive job. Please take a look at the YARN ResourceManager role log and the ResourceManager scheduler page for clues. This could be due to a lack of memory (AM, NodeManager, or scheduler), vcores, or several other factors in YARN tuning. The Quickstart VM is tuned for a very small demo environment, so you may need to add memory and/or cores to the VM and then tune YARN so there are enough resources for the second Hive job that Oozie launches. The blog post below should help with this tuning.

 

https://blog.cloudera.com/blog/2015/10/untangling-apache-hadoop-yarn-part-2/
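As an illustration only, below is a sketch of the kind of YARN settings to review. The property names are standard yarn-site.xml settings, but the values are example assumptions for a small single-node VM, not recommendations, and on CDH they are normally changed through Cloudera Manager rather than by editing the file by hand:

<configuration>

<!-- Illustrative values only: total memory and vcores the NodeManager can hand out to containers on this host -->

<property>

<name>yarn.nodemanager.resource.memory-mb</name>

<value>8192</value>

</property>

<property>

<name>yarn.nodemanager.resource.cpu-vcores</name>

<value>4</value>

</property>

<!-- Smallest and largest single container the scheduler will allocate -->

<property>

<name>yarn.scheduler.minimum-allocation-mb</name>

<value>1024</value>

</property>

<property>

<name>yarn.scheduler.maximum-allocation-mb</name>

<value>4096</value>

</property>

</configuration>

With enough NodeManager memory and vcores for at least two application masters plus their tasks, the Oozie launcher and the Hive job it submits should both be able to get containers.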

 

 



Robert Justice, Technical Resolution Manager

