Oozie script action containing Sqoop import failing

New Contributor

I am trying to build an Oozie workflow action that imports data from MySQL using Sqoop through a shell script (CDH 5.8).

 

Workflow steps:
1. Delete any existing directories.

2. A Java action reads the metadata Hive tables and creates the table_metadata directory and the *.cf files.

3. A shell script iterates through the table_metadata directory and scans for config files (*.cf). Each file contains the name of a table to be imported; the script reads that name into a variable, which is then used in the Sqoop import command.

 

The same script, including the Sqoop import, works fine when I run it from the command line (sh script.sh).

 

However, when I run it as an Oozie shell action (through the Cloudera Hue GUI), it fails with the following error.

 

Any ideas why the Oozie job is failing?

Shell Script:

 

hdfs_path='hdfs://quickstart.cloudera:8020/user/cloudera/workflow/table_metadata'
table_temp_path='hdfs://quickstart.cloudera:8020/user/cloudera/workflow/hive_temp'

if hadoop fs -test -e "$hdfs_path"
then
    for file in $(hadoop fs -ls "$hdfs_path" | grep -o -e "$hdfs_path/.*")
    do
        echo "$file"
        TABLENAME=$(hadoop fs -cat "$file")
        echo "$TABLENAME"
        HDFSPATH=$table_temp_path
        sqoop import --connect jdbc:mysql://quickstart.cloudera:3306/retail_db --table "$TABLENAME" --username=retail_dba --password=cloudera --direct -m 1 --delete-target-dir --target-dir "$HDFSPATH"
    done
fi
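As an aside, the grep-based extraction of paths from the `hadoop fs -ls` listing is fragile; the path is always the last whitespace-separated field, so awk can pull it out directly. A minimal sketch (the listing below is illustrative sample output, not real cluster output):

```shell
# Sample `hadoop fs -ls` output: a "Found N items" header line,
# then one data line per file where the last field is the path.
sample_listing='Found 2 items
-rw-r--r--   1 cloudera cloudera   11 2017-02-01 20:55 /user/cloudera/workflow/table_metadata/departments.cf
-rw-r--r--   1 cloudera cloudera    9 2017-02-01 20:55 /user/cloudera/workflow/table_metadata/orders.cf'

# Keep only lines whose last field ends in .cf and print that field.
echo "$sample_listing" | awk '$NF ~ /\.cf$/ {print $NF}'
```

In the real script, `echo "$sample_listing"` would be replaced by `hadoop fs -ls "$hdfs_path"`; the header line is skipped automatically because it does not end in .cf.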

 

workflow.xml
----------

 

<workflow-app name="RDB2Hive" xmlns="uri:oozie:workflow:0.5">
    <start to="fs-1051"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="fs-1051">
        <fs>
            <delete path='${nameNode}/user/cloudera/workflow/table_metadata'/>
            <mkdir path='${nameNode}/user/cloudera/workflow/table_metadata'/>
        </fs>
        <ok to="java-9025"/>
        <error to="Kill"/>
    </action>
    <action name="java-9025">
        <java>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <main-class>org.rd2h.app.LoadMetaData</main-class>
            <arg>load_metadata</arg>
            <arg>/user/cloudera/workflow/table_metadata</arg>
        </java>
        <ok to="shell-d3bf"/>
        <error to="Kill"/>
    </action>
    <action name="shell-d3bf">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>import_script.sh</exec>
            <file>/user/cloudera/workflow/scripts/import_script.sh#import_script.sh</file>
            <capture-output/>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
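For completeness, a job.properties along these lines could drive the workflow above. This is a sketch: the application path and the ResourceManager port are assumptions based on the quickstart-VM defaults used elsewhere in the post, so adjust them to the actual deployment.

```properties
nameNode=hdfs://quickstart.cloudera:8020
jobTracker=localhost:8032
# Assumed HDFS directory containing workflow.xml; change to where the app is deployed.
oozie.wf.application.path=${nameNode}/user/cloudera/workflow
# Pull Sqoop and other shared jars from the Oozie sharelib.
oozie.use.system.libpath=true
```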

 

 

----------

MR Error LOG:

Job init failed : org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/cloudera/.staging/job_1486009475788_0032/job.splitmetainfo
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1580)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1444)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1402)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1333)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1101)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1540)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1536)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1469)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/cloudera/.staging/job_1486009475788_0032/job.splitmetainfo
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1575)


Oozie error log:

 

Stdoutput 2017-02-01 20:57:31,101 INFO [main] sqoop.Sqoop (Sqoop.java:<init>(92)) - Running Sqoop version: 1.4.6-cdh5.8.0
Stdoutput 2017-02-01 20:57:31,113 WARN [main] tool.BaseSqoopTool (BaseSqoopTool.java:applyCredentialsOptions(1042)) - Setting your password on the command-line is insecure. Consider using -P instead.
Stdoutput 2017-02-01 20:57:31,304 INFO [main] manager.MySQLManager (MySQLManager.java:initOptionDefaults(71)) - Preparing to use a MySQL streaming resultset.
Stdoutput 2017-02-01 20:57:31,309 INFO [main] tool.CodeGenTool (CodeGenTool.java:generateORM(92)) - Beginning code generation
Stdoutput 2017-02-01 20:57:31,560 INFO [main] manager.SqlManager (SqlManager.java:execute(776)) - Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
Stdoutput 2017-02-01 20:57:31,579 INFO [main] manager.SqlManager (SqlManager.java:execute(776)) - Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
Stdoutput 2017-02-01 20:57:31,582 INFO [main] orm.CompilationManager (CompilationManager.java:findHadoopJars(94)) - HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Stdoutput 2017-02-01 20:57:32,587 INFO [main] orm.CompilationManager (CompilationManager.java:jar(330)) - Writing jar file: /tmp/sqoop-yarn/compile/94cbe03d9d51f6ccc47ddd3ca98032be/departments.jar
Stdoutput 2017-02-01 20:57:33,182 INFO [main] tool.ImportTool (ImportTool.java:deleteTargetDir(544)) - Destination directory hdfs://quickstart.cloudera:8020/user/cloudera/workflow/hive_temp is not present, hence not deleting.
Stdoutput 2017-02-01 20:57:33,187 INFO [main] manager.DirectMySQLManager (DirectMySQLManager.java:importTable(83)) - Beginning mysqldump fast path import
Stdoutput 2017-02-01 20:57:33,187 INFO [main] mapreduce.ImportJobBase (ImportJobBase.java:runImport(242)) - Beginning import of departments
Stdoutput 2017-02-01 20:57:33,188 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1174)) - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
Stdoutput 2017-02-01 20:57:33,203 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1174)) - mapred.jar is deprecated. Instead, use mapreduce.job.jar
Stdoutput 2017-02-01 20:57:33,210 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1174)) - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
Stdoutput 2017-02-01 20:57:33,253 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at localhost/127.0.0.1:8032
Stdoutput 2017-02-01 20:57:35,040 INFO [main] db.DBInputFormat (DBInputFormat.java:setTxIsolation(192)) - Using read commited transaction isolation
Stdoutput 2017-02-01 20:57:35,072 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(202)) - number of splits:1
Stdoutput 2017-02-01 20:57:35,190 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(291)) - Submitting tokens for job: job_1486009475788_0032
Stdoutput 2017-02-01 20:57:35,190 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(293)) - Kind: mapreduce.job, Service: job_1486009475788_0029, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@76f3da25)
Stdoutput 2017-02-01 20:57:35,198 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(293)) - Kind: RM_DELEGATION_TOKEN, Service: 127.0.0.1:8032, Ident: (owner=cloudera, renewer=oozie mr token, realUser=oozie, issueDate=1486011413559, maxDate=1486616213559, sequenceNumber=67, masterKeyId=2)
Stdoutput 2017-02-01 20:57:35,439 INFO [main] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(260)) - Submitted application application_1486009475788_0032
Stdoutput 2017-02-01 20:57:35,463 INFO [main] mapreduce.Job (Job.java:submit(1311)) - The url to track the job: http://quickstart.cloudera:8088/proxy/application_1486009475788_0032/
Stdoutput 2017-02-01 20:57:35,463 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1356)) - Running job: job_1486009475788_0032
Stdoutput 2017-02-01 20:57:41,569 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1377)) - Job job_1486009475788_0032 running in uber mode : false
Stdoutput 2017-02-01 20:57:41,569 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1384)) - map 0% reduce 0%
Stdoutput 2017-02-01 20:57:41,682 INFO [main] mapred.ClientServiceDelegate (ClientServiceDelegate.java:getProxy(277)) - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
Stdoutput 2017-02-01 20:57:41,717 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1397)) - Job job_1486009475788_0032 failed with state FAILED due to: 
Stdoutput 2017-02-01 20:57:41,725 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(393)) - The MapReduce job has already been retired. Performance
Stdoutput 2017-02-01 20:57:41,725 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(394)) - counters are unavailable. To get this information, 
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(395)) - you will need to enable the completed job store on 
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(396)) - the jobtracker with:
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(397)) - mapreduce.jobtracker.persist.jobstatus.active = true
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(398)) - mapreduce.jobtracker.persist.jobstatus.hours = 1
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(399)) - A jobtracker restart is required for these settings
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(400)) - to take effect.
Stdoutput 2017-02-01 20:57:41,726 ERROR [main] tool.ImportTool (ImportTool.java:run(631)) - Error during import: Import job failed!
Exit code of the Shell command 1
<<< Invocation of Shell command completed <<<


<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]

Oozie Launcher failed, finishing Hadoop job gracefully

Oozie Launcher, uploading action data to HDFS sequence file: hdfs://quickstart.cloudera:8020/user/cloudera/oozie-oozi/0000013-170201202514643-oozie-oozi-W/shell-d3bf--shell/action-data.seq

Oozie Launcher ends
 

 

2 Replies

Re: Oozie script action containing Sqoop import failing

Hi @Pythor, I had the same problem. Did you find any solution? Thanks.

Re: Oozie script action containing Sqoop import failing

New Contributor

Hi, I had the same problem: a Sqoop script that executed fine when run from the server console, but failed when I tried to execute it with Oozie.
