New Contributor
Posts: 1
Registered: ‎02-02-2017

Oozie script action containing Sqoop import failing

I am trying to build an Oozie workflow action that imports data from MySQL using Sqoop via a shell script (CDH 5.8).

 

Workflow steps:
1. Delete any existing directories.

2. A Java action reads the metadata Hive tables and creates the table_metadata directory and the *.cf files.

3. A shell script iterates through the table_metadata directory and scans for config files (*.cf). Each file contains the name of a table to be imported. The script reads that name into the table_name variable, which is used in the Sqoop import query.
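The scan-and-read logic in step 3 can be sketched locally without HDFS (a hypothetical layout, assuming each .cf file contains just a table name; the real script uses hadoop fs -ls and hadoop fs -cat against HDFS instead):

```shell
# Local sketch of step 3: scan a directory for *.cf files and read the
# table name that each file contains. File names here are illustrative.
metadata_dir=$(mktemp -d)
printf 'departments\n' > "$metadata_dir/departments.cf"
printf 'categories\n'  > "$metadata_dir/categories.cf"

for cf in "$metadata_dir"/*.cf
do
  table_name=$(cat "$cf")
  echo "would import table: $table_name"
done
```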

 

The same script, including the Sqoop call, works fine when I run it from the command line (sh script.sh).

 

However, when I run it as a workflow shell-script action through Oozie (Cloudera Hue GUI), it fails with the following error.

 

Any ideas why the Oozie job is failing?

Shell Script:

 

hdfs_path='hdfs://quickstart.cloudera:8020/user/cloudera/workflow/table_metadata'
table_temp_path='hdfs://quickstart.cloudera:8020/user/cloudera/workflow/hive_temp'
if hadoop fs -test -e "$hdfs_path"
then
  for file in $(hadoop fs -ls "$hdfs_path" | grep -o -e "$hdfs_path/*.*")
  do
    echo "${file}"
    TABLENAME=$(hadoop fs -cat "${file}")
    echo "$TABLENAME"
    HDFSPATH=$table_temp_path
    sqoop import --connect jdbc:mysql://quickstart.cloudera:3306/retail_db --table departments --username=retail_dba --password=cloudera --direct -m 1 --delete-target-dir --target-dir "$table_temp_path"
  done
fi
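Note that the loop reads each table name into TABLENAME, but the sqoop command still hardcodes --table departments. If the per-file name is meant to drive the import, as the step description suggests, composing the command might look like the sketch below (echoed rather than executed here; the connection details are copied from the script above):

```shell
# Hypothetical: substitute the table name read from the .cf file into the
# import instead of the hardcoded `--table departments`. Echoed only, since
# running it needs a live cluster.
TABLENAME="departments"   # in the real script: TABLENAME=$(hadoop fs -cat "${file}")
table_temp_path='hdfs://quickstart.cloudera:8020/user/cloudera/workflow/hive_temp'

sqoop_cmd="sqoop import --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
--table $TABLENAME --username=retail_dba --password=cloudera \
--direct -m 1 --delete-target-dir --target-dir $table_temp_path"
echo "$sqoop_cmd"
```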

 

WorkFlow.xml
----------

 

<workflow-app name="RDB2Hive" xmlns="uri:oozie:workflow:0.5">
    <start to="fs-1051"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="fs-1051">
        <fs>
            <delete path='${nameNode}/user/cloudera/workflow/table_metadata'/>
            <mkdir path='${nameNode}/user/cloudera/workflow/table_metadata'/>
        </fs>
        <ok to="java-9025"/>
        <error to="Kill"/>
    </action>
    <action name="java-9025">
        <java>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <main-class>org.rd2h.app.LoadMetaData</main-class>
            <arg>load_metadata</arg>
            <arg>/user/cloudera/workflow/table_metadata</arg>
        </java>
        <ok to="shell-d3bf"/>
        <error to="Kill"/>
    </action>
    <action name="shell-d3bf">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>import_script.sh</exec>
            <file>/user/cloudera/workflow/scripts/import_script.sh#import_script.sh</file>
            <capture-output/>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>

 

 

----------

MR error log:

Job init failed : org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/cloudera/.staging/job_1486009475788_0032/job.splitmetainfo
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1580)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1444)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1402)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1333)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1101)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1540)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1536)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1469)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/cloudera/.staging/job_1486009475788_0032/job.splitmetainfo
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1575)


Oozie error log:

 

Stdoutput 2017-02-01 20:57:31,101 INFO [main] sqoop.Sqoop (Sqoop.java:<init>(92)) - Running Sqoop version: 1.4.6-cdh5.8.0
Stdoutput 2017-02-01 20:57:31,113 WARN [main] tool.BaseSqoopTool (BaseSqoopTool.java:applyCredentialsOptions(1042)) - Setting your password on the command-line is insecure. Consider using -P instead.
Stdoutput 2017-02-01 20:57:31,304 INFO [main] manager.MySQLManager (MySQLManager.java:initOptionDefaults(71)) - Preparing to use a MySQL streaming resultset.
Stdoutput 2017-02-01 20:57:31,309 INFO [main] tool.CodeGenTool (CodeGenTool.java:generateORM(92)) - Beginning code generation
Stdoutput 2017-02-01 20:57:31,560 INFO [main] manager.SqlManager (SqlManager.java:execute(776)) - Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
Stdoutput 2017-02-01 20:57:31,579 INFO [main] manager.SqlManager (SqlManager.java:execute(776)) - Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
Stdoutput 2017-02-01 20:57:31,582 INFO [main] orm.CompilationManager (CompilationManager.java:findHadoopJars(94)) - HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Stdoutput 2017-02-01 20:57:32,587 INFO [main] orm.CompilationManager (CompilationManager.java:jar(330)) - Writing jar file: /tmp/sqoop-yarn/compile/94cbe03d9d51f6ccc47ddd3ca98032be/departments.jar
Stdoutput 2017-02-01 20:57:33,182 INFO [main] tool.ImportTool (ImportTool.java:deleteTargetDir(544)) - Destination directory hdfs://quickstart.cloudera:8020/user/cloudera/workflow/hive_temp is not present, hence not deleting.
Stdoutput 2017-02-01 20:57:33,187 INFO [main] manager.DirectMySQLManager (DirectMySQLManager.java:importTable(83)) - Beginning mysqldump fast path import
Stdoutput 2017-02-01 20:57:33,187 INFO [main] mapreduce.ImportJobBase (ImportJobBase.java:runImport(242)) - Beginning import of departments
Stdoutput 2017-02-01 20:57:33,188 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1174)) - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
Stdoutput 2017-02-01 20:57:33,203 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1174)) - mapred.jar is deprecated. Instead, use mapreduce.job.jar
Stdoutput 2017-02-01 20:57:33,210 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1174)) - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
Stdoutput 2017-02-01 20:57:33,253 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at localhost/127.0.0.1:8032
Stdoutput 2017-02-01 20:57:35,040 INFO [main] db.DBInputFormat (DBInputFormat.java:setTxIsolation(192)) - Using read commited transaction isolation
Stdoutput 2017-02-01 20:57:35,072 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(202)) - number of splits:1
Stdoutput 2017-02-01 20:57:35,190 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(291)) - Submitting tokens for job: job_1486009475788_0032
Stdoutput 2017-02-01 20:57:35,190 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(293)) - Kind: mapreduce.job, Service: job_1486009475788_0029, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@76f3da25)
Stdoutput 2017-02-01 20:57:35,198 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(293)) - Kind: RM_DELEGATION_TOKEN, Service: 127.0.0.1:8032, Ident: (owner=cloudera, renewer=oozie mr token, realUser=oozie, issueDate=1486011413559, maxDate=1486616213559, sequenceNumber=67, masterKeyId=2)
Stdoutput 2017-02-01 20:57:35,439 INFO [main] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(260)) - Submitted application application_1486009475788_0032
Stdoutput 2017-02-01 20:57:35,463 INFO [main] mapreduce.Job (Job.java:submit(1311)) - The url to track the job: http://quickstart.cloudera:8088/proxy/application_1486009475788_0032/
Stdoutput 2017-02-01 20:57:35,463 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1356)) - Running job: job_1486009475788_0032
Stdoutput 2017-02-01 20:57:41,569 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1377)) - Job job_1486009475788_0032 running in uber mode : false
Stdoutput 2017-02-01 20:57:41,569 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1384)) - map 0% reduce 0%
Stdoutput 2017-02-01 20:57:41,682 INFO [main] mapred.ClientServiceDelegate (ClientServiceDelegate.java:getProxy(277)) - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
Stdoutput 2017-02-01 20:57:41,717 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1397)) - Job job_1486009475788_0032 failed with state FAILED due to: 
Stdoutput 2017-02-01 20:57:41,725 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(393)) - The MapReduce job has already been retired. Performance
Stdoutput 2017-02-01 20:57:41,725 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(394)) - counters are unavailable. To get this information, 
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(395)) - you will need to enable the completed job store on 
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(396)) - the jobtracker with:
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(397)) - mapreduce.jobtracker.persist.jobstatus.active = true
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(398)) - mapreduce.jobtracker.persist.jobstatus.hours = 1
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(399)) - A jobtracker restart is required for these settings
Stdoutput 2017-02-01 20:57:41,726 INFO [main] mapreduce.ImportJobBase (JobBase.java:displayRetiredJobNotice(400)) - to take effect.
Stdoutput 2017-02-01 20:57:41,726 ERROR [main] tool.ImportTool (ImportTool.java:run(631)) - Error during import: Import job failed!
Exit code of the Shell command 1
<<< Invocation of Shell command completed <<<


<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]

Oozie Launcher failed, finishing Hadoop job gracefully

Oozie Launcher, uploading action data to HDFS sequence file: hdfs://quickstart.cloudera:8020/user/cloudera/oozie-oozi/0000013-170201202514643-oozie-oozi-W/shell-d3bf--shell/action-data.seq

Oozie Launcher ends
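When the launched MapReduce job dies before producing counters, the full YARN container logs usually hold the real cause. A small sketch for pulling the application id out of a log line like the ones above (the yarn logs invocation itself needs a live cluster, so it is only composed as a string here):

```shell
# Extract the YARN application id from an Oozie/Sqoop log line, then build
# the `yarn logs` command one would run on the cluster to see the real error.
log_line="Submitted application application_1486009475788_0032"
app_id=$(echo "$log_line" | grep -o 'application_[0-9]*_[0-9]*')
echo "yarn logs -applicationId $app_id"
```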
 

 

Explorer
Posts: 14
Registered: ‎02-14-2017

Re: Oozie script action containing Sqoop import failing

Hi @Pythor, I had the same problem. Did you find any solution? Thanks.

New Contributor
Posts: 3
Registered: ‎10-19-2017

Re: Oozie script action containing Sqoop import failing

Hi, I had the same problem: a Sqoop script that executed fine when run in the server console, but failed when I tried to execute it with Oozie.
