
Oozie script action containing Sqoop import failing

New Contributor

I am trying to build an Oozie workflow that imports data from MySQL using Sqoop, run through a shell script action (CDH 5.8).


Workflow steps:
1. An fs action deletes any existing directories.

2. A Java action reads the metadata Hive tables and creates the table_metadata directory and the *.cf files.

3. A shell script iterates through the table_metadata directory and scans for config files (*.cf). Each file contains the name of one table to be imported; the script reads that name into the TABLENAME variable, which is used in the Sqoop import command.
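The per-table iteration in step 3 can be sketched against the local filesystem, with no cluster needed; here `ls`-style globbing and `cat` stand in for `hadoop fs -ls` and `hadoop fs -cat`, and the directory and file names are hypothetical stand-ins for the HDFS paths in the real workflow:

```shell
#!/bin/sh
# Local-filesystem sketch of step 3 (paths are stand-ins, not from the post).
metadata_dir=/tmp/table_metadata_demo
mkdir -p "$metadata_dir"
printf 'departments\n' > "$metadata_dir/departments.cf"

for file in "$metadata_dir"/*.cf; do
    # each .cf file holds one table name to import
    TABLENAME=$(cat "$file")
    echo "would import table: $TABLENAME"
done
```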


The same script, Sqoop command included, works fine when I run it from the command line as (sh


However, when I run it as a workflow shell action through Oozie (Cloudera Hue GUI), it fails with the following error.


Any ideas why the Oozie job is failing?

Shell Script:


hdfs_path='hdfs://quickstart.cloudera:8020/user/cloudera/workflow/table_metadata'
table_temp_path='hdfs://quickstart.cloudera:8020/user/cloudera/workflow/hive_temp'

if hadoop fs -test -e "$hdfs_path"; then
    for file in $(hadoop fs -ls "$hdfs_path" | grep -o -e "$hdfs_path/*.*"); do
        echo "${file}"
        TABLENAME=$(hadoop fs -cat "${file}")
        sqoop import --connect jdbc:mysql://quickstart.cloudera:3306/retail_db --table departments --username=retail_dba --password=cloudera --direct -m 1 --delete-target-dir --target-dir "$table_temp_path"
    done
fi
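As posted, the script reads each table name into TABLENAME but then hard-codes --table departments, so every iteration would import the same table. Presumably the variable should be substituted in; the sketch below builds the command as a string so the substitution is visible (connection details copied from the post, TABLENAME stubbed since there is no cluster here):

```shell
#!/bin/sh
# Sketch: substitute the per-file table name into the sqoop command instead
# of the hard-coded "departments".
TABLENAME=departments   # in the real script this comes from `hadoop fs -cat`
table_temp_path='hdfs://quickstart.cloudera:8020/user/cloudera/workflow/hive_temp'

sqoop_cmd="sqoop import --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
 --table $TABLENAME --username retail_dba -P --direct -m 1 \
 --delete-target-dir --target-dir $table_temp_path"
echo "$sqoop_cmd"
```

Using -P (prompt) rather than --password also avoids the insecure-password warning visible in the log below; for non-interactive Oozie runs, Sqoop's --password-file option is the usual alternative.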




<workflow-app name="RDB2Hive" xmlns="uri:oozie:workflow:0.5">
    <start to="fs-1051"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="fs-1051">
        <fs>
            <delete path='${nameNode}/user/cloudera/workflow/table_metadata'/>
            <mkdir path='${nameNode}/user/cloudera/workflow/table_metadata'/>
        </fs>
        <ok to="java-9025"/>
        <error to="Kill"/>
    </action>
    <action name="java-9025">
        <!-- java action body omitted in the original post -->
        <ok to="shell-d3bf"/>
        <error to="Kill"/>
    </action>
    <action name="shell-d3bf">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <!-- shell action body omitted in the original post -->
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
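The forum stripped the bodies of the java and shell actions. For reference, a typical shell action body under the uri:oozie:shell-action:0.1 schema looks roughly like the following; the script path here is hypothetical, not taken from the post:

```xml
<shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec></exec>
    <!-- ship the script with the action so the launcher can find it -->
    <file>/user/cloudera/workflow/scripts/</file>
    <capture-output/>
</shell>
```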




MR Error LOG:

Job init failed : org.apache.hadoop.yarn.exceptions.YarnRuntimeException: File does not exist: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/cloudera/.staging/job_1486009475788_0032/job.splitmetainfo
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(
at org.apache.hadoop.service.AbstractService.start(
at Method)
***Caused by: File does not exist: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/cloudera/.staging/job_1486009475788_0032/job.splitmetainfo***
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(
at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(

Oozie error log:


Stdoutput 2017-02-01 20:57:31,101 INFO [main] sqoop.Sqoop (<init>(92)) - Running Sqoop version: 1.4.6-cdh5.8.0
Stdoutput 2017-02-01 20:57:31,113 WARN [main] tool.BaseSqoopTool ( - Setting your password on the command-line is insecure. Consider using -P instead.
Stdoutput 2017-02-01 20:57:31,304 INFO [main] manager.MySQLManager ( - Preparing to use a MySQL streaming resultset.
Stdoutput 2017-02-01 20:57:31,309 INFO [main] tool.CodeGenTool ( - Beginning code generation
Stdoutput 2017-02-01 20:57:31,560 INFO [main] manager.SqlManager ( - Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
Stdoutput 2017-02-01 20:57:31,579 INFO [main] manager.SqlManager ( - Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
Stdoutput 2017-02-01 20:57:31,582 INFO [main] orm.CompilationManager ( - HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Stdoutput 2017-02-01 20:57:32,587 INFO [main] orm.CompilationManager ( - Writing jar file: /tmp/sqoop-yarn/compile/94cbe03d9d51f6ccc47ddd3ca98032be/departments.jar
Stdoutput 2017-02-01 20:57:33,182 INFO [main] tool.ImportTool ( - Destination directory hdfs://quickstart.cloudera:8020/user/cloudera/workflow/hive_temp is not present, hence not deleting.
Stdoutput 2017-02-01 20:57:33,187 INFO [main] manager.DirectMySQLManager ( - Beginning mysqldump fast path import
Stdoutput 2017-02-01 20:57:33,187 INFO [main] mapreduce.ImportJobBase ( - Beginning import of departments
Stdoutput 2017-02-01 20:57:33,188 INFO [main] Configuration.deprecation ( - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
Stdoutput 2017-02-01 20:57:33,203 INFO [main] Configuration.deprecation ( - mapred.jar is deprecated. Instead, use mapreduce.job.jar
Stdoutput 2017-02-01 20:57:33,210 INFO [main] Configuration.deprecation ( - is deprecated. Instead, use mapreduce.job.maps
Stdoutput 2017-02-01 20:57:33,253 INFO [main] client.RMProxy ( - Connecting to ResourceManager at localhost/
Stdoutput 2017-02-01 20:57:35,040 INFO [main] db.DBInputFormat ( - Using read commited transaction isolation
Stdoutput 2017-02-01 20:57:35,072 INFO [main] mapreduce.JobSubmitter ( - number of splits:1
Stdoutput 2017-02-01 20:57:35,190 INFO [main] mapreduce.JobSubmitter ( - Submitting tokens for job: job_1486009475788_0032
Stdoutput 2017-02-01 20:57:35,190 INFO [main] mapreduce.JobSubmitter ( - Kind: mapreduce.job, Service: job_1486009475788_0029, Ident: (
Stdoutput 2017-02-01 20:57:35,198 INFO [main] mapreduce.JobSubmitter ( - Kind: RM_DELEGATION_TOKEN, Service:, Ident: (owner=cloudera, renewer=oozie mr token, realUser=oozie, issueDate=1486011413559, maxDate=1486616213559, sequenceNumber=67, masterKeyId=2)
Stdoutput 2017-02-01 20:57:35,439 INFO [main] impl.YarnClientImpl ( - Submitted application application_1486009475788_0032
Stdoutput 2017-02-01 20:57:35,463 INFO [main] mapreduce.Job ( - The url to track the job: http://quickstart.cloudera:8088/proxy/application_1486009475788_0032/
Stdoutput 2017-02-01 20:57:35,463 INFO [main] mapreduce.Job ( - Running job: job_1486009475788_0032
Stdoutput 2017-02-01 20:57:41,569 INFO [main] mapreduce.Job ( - Job job_1486009475788_0032 running in uber mode : false
Stdoutput 2017-02-01 20:57:41,569 INFO [main] mapreduce.Job ( - map 0% reduce 0%
Stdoutput 2017-02-01 20:57:41,682 INFO [main] mapred.ClientServiceDelegate ( - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
Stdoutput 2017-02-01 20:57:41,717 INFO [main] mapreduce.Job ( - Job job_1486009475788_0032 failed with state FAILED due to: 
Stdoutput 2017-02-01 20:57:41,725 INFO [main] mapreduce.ImportJobBase ( - The MapReduce job has already been retired. Performance
Stdoutput 2017-02-01 20:57:41,725 INFO [main] mapreduce.ImportJobBase ( - counters are unavailable. To get this information, 
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase ( - you will need to enable the completed job store on 
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase ( - the jobtracker with:
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase ( - = true
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase ( - mapreduce.jobtracker.persist.jobstatus.hours = 1
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase ( - A jobtracker restart is required for these settings
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase ( - to take effect.
Stdoutput 2017-02-01 20:57:41,726 ERROR [main] tool.ImportTool ( - Error during import: Import job failed!
Exit code of the Shell command 1
<<< Invocation of Shell command completed <<<

<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]

Oozie Launcher failed, finishing Hadoop job gracefully

Oozie Launcher, uploading action data to HDFS sequence file: hdfs://quickstart.cloudera:8020/user/cloudera/oozie-oozi/0000013-170201202514643-oozie-oozi-W/shell-d3bf--shell/action-data.seq

Oozie Launcher ends



Hi @Pythor, I had the same problem. Did you find a solution? Thanks.

New Contributor

Hi, I had the same problem: a Sqoop script that executed fine when run from the server console, but failed when I tried to run it through Oozie.