Contributor
Posts: 37
Registered: ‎02-12-2016

Configure oozie with sqoop

Hi all,

 

I have run into a problem with our new cluster on Cloudera 5.7.1.

 

The installation is complete, and all services are up and running.

HA is enabled for the NameNode.

 

We tried some jobs, and these work fine:

- sqoop import

- oozie workflows with hive job

 

But this one does not work:

- oozie workflow with sqoop import.

 

My sqoop-metastore is running on "node1", and the Oozie server on "node2".

The sqoop-metastore is started through the init.d script from the installation.

 

If I am on node2 and run:

sudo -u oozie sqoop job --list

I can see all the jobs stored in the sqoop-metastore.

 

But when we start a simple Oozie workflow that only runs a sqoop import, it doesn't work. I get this error:

>>> Invoking Sqoop command line now >>>

3603 [uber-SubtaskRunner] WARN  org.apache.sqoop.tool.SqoopTool  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3624 [uber-SubtaskRunner] INFO  org.apache.sqoop.Sqoop  - Running Sqoop version: 1.4.6-cdh5.7.1
3873 [uber-SubtaskRunner] ERROR org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage  - Cannot restore job: sqoop_job_name
3873 [uber-SubtaskRunner] ERROR org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage  - (No such job)
3873 [uber-SubtaskRunner] ERROR org.apache.sqoop.tool.JobTool  - I/O error performing job operation: java.io.IOException: Cannot restore missing job sqoop_job_name

But I don't know why, because it works from the command line as the oozie user...

 

I tried copying sqoop-site.xml into the /etc/oozie/conf directory, but it doesn't work.

Only one value is set in sqoop-site.xml:

<property>
    <name>sqoop.metastore.client.autoconnect.url</name>
    <value>jdbc:hsqldb:hsql://host.domain.ltd:16000/sqoop</value>
    <description>Port that this metastore should listen on.</description>
 </property>

Did I forget something?

Is there something more I need to do to allow Oozie to start a sqoop job?

 

Thanks for your help!

 

Regards,

Cloudera Employee
Posts: 24
Registered: ‎06-17-2016

Re: Configure oozie with sqoop

The sqoop-site.xml should be added in the <job-xml> tag of the Sqoop action so Oozie can use it.

I hope this helps.
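
As a minimal sketch of that suggestion, a Sqoop action that loads sqoop-site.xml through <job-xml> might look like the fragment below. The action name, the ${jobTracker} and ${nameNode} parameters, and the transition targets are placeholders. The fragment also assumes sqoop-site.xml has been uploaded next to workflow.xml on HDFS; none of this is taken from the original poster's workflow.

```xml
<!-- Sketch only: action name, parameters and transitions are placeholders. -->
<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.3">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Assumption: sqoop-site.xml sits in the workflow application directory on HDFS -->
        <job-xml>sqoop-site.xml</job-xml>
        <!-- Runs the saved job named in the error message above -->
        <command>job --exec sqoop_job_name</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
```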

 

Contributor
Posts: 37
Registered: ‎02-12-2016

Re: Configure oozie with sqoop

Hi gezapeti,

 

Thanks for your answer, but it doesn't work.

 

We have used <job-xml> from the beginning to specify the hive-site.xml file.

 

So, as you suggested, I tried adding sqoop-site.xml to the same tag, separated by a comma, but it doesn't work.

Then I tried replacing hive-site.xml with sqoop-site.xml, with no more success.

 

Here is the output:

Sqoop command arguments :
             job
             --exec
             sqoop_job_name
             --
             --target-dir
             /path/to/destination
             --username
             db_user
             --password
             db_password
             --hive-import
             --hive-table
             db.table
Fetching child yarn jobs
tag id : oozie-e6933cd082a6f0d7ab69512a867cbac3
Child yarn jobs are found - 
=================================================================

>>> Invoking Sqoop command line now >>>

3755 [uber-SubtaskRunner] WARN  org.apache.sqoop.tool.SqoopTool  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3776 [uber-SubtaskRunner] INFO  org.apache.sqoop.Sqoop  - Running Sqoop version: 1.4.6-cdh5.7.1
3948 [uber-SubtaskRunner] ERROR org.apache.sqoop.Sqoop  - Got exception running Sqoop: java.lang.IllegalArgumentException: Passed Null for map null
Intercepting System.exit(1)

<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

Do you have another idea?

Contributor
Posts: 37
Registered: ‎02-12-2016

Re: Configure oozie with sqoop

Sorry gezapeti, I haven't been able to reply. When I submit a post, it is not published, and there is no error message... But your configuration doesn't work. I tried it and got a different error...
3755 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3776 [uber-SubtaskRunner] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6-cdh5.7.1
3948 [uber-SubtaskRunner] ERROR org.apache.sqoop.Sqoop - Got exception running Sqoop: java.lang.IllegalArgumentException: Passed Null for map null
Intercepting System.exit(1)
Posts: 913
Kudos: 110
Solutions: 58
Registered: ‎04-06-2015

Re: Configure oozie with sqoop

Sorry about the posting issues. The posts for some reason ended up in the site's spam filter. I have placed one of them back here for you and removed the duplicates.


Cy Jervis, Community Manager



Cloudera Employee
Posts: 24
Registered: ‎06-17-2016

Re: Configure oozie with sqoop

You may define multiple <job-xml> tags after each other in Sqoop Action schema 0.3: https://oozie.apache.org/docs/4.2.0/DG_SqoopActionExtension.html#Sqoop_Action_Schema_Version_0.3

 

 

Also, please pass the --verbose option to Sqoop so we may have a more detailed output.
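
A rough sketch of both suggestions together, assuming the 0.3 schema and that both XML files are available in the workflow application directory, could look like this. The ${jobTracker} and ${nameNode} parameters are placeholders, and the position of --verbose among the job arguments is an assumption.

```xml
<!-- Sketch only: parameters and file locations are assumptions. -->
<sqoop xmlns="uri:oozie:sqoop-action:0.3">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- One <job-xml> element per file, rather than a comma-separated list -->
    <job-xml>hive-site.xml</job-xml>
    <job-xml>sqoop-site.xml</job-xml>
    <!-- One <arg> per command-line token -->
    <arg>job</arg>
    <arg>--verbose</arg>
    <arg>--exec</arg>
    <arg>sqoop_job_name</arg>
</sqoop>
```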

Contributor
Posts: 37
Registered: ‎02-12-2016

Re: Configure oozie with sqoop

Hi gezapeti,

 

I tried that, but it doesn't work either.

 

But I found something else.

 

In my workflow, I replaced the sqoop job with a simple action: "sqoop job --list".

The workflow runs successfully, but the output is empty.

 

Now, if I keep the same "sqoop job --list" command but add the "--meta-connect" option with the URL of our sqoop-metastore, I get this error:

 

3557 [uber-SubtaskRunner] WARN  org.apache.sqoop.tool.SqoopTool  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3577 [uber-SubtaskRunner] INFO  org.apache.sqoop.Sqoop  - Running Sqoop version: 1.4.6-cdh5.7.1
3676 [uber-SubtaskRunner] ERROR org.apache.sqoop.tool.JobTool  - I/O error performing job operation: java.io.IOException: Exception creating SQL connection
	at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:216)
	at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.open(HsqldbJobStorage.java:161)
	at org.apache.sqoop.tool.JobTool.run(JobTool.java:274)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
	at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
	at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:177)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:49)
	at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:46)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:388)
	at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:302)
	at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:187)
	at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:230)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: socket creation error
	at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
	at org.hsqldb.jdbc.jdbcConnection.<init>(Unknown Source)
	at org.hsqldb.jdbcDriver.getConnection(Unknown Source)
	at org.hsqldb.jdbcDriver.connect(Unknown Source)
	at java.sql.DriverManager.getConnection(DriverManager.java:571)
	at java.sql.DriverManager.getConnection(DriverManager.java:233)
	at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:174)
	... 29 more

Intercepting System.exit(1)

<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

As we can see, it could come from a missing library or something else.

It seems it can't find a driver to connect to the sqoop-metastore.

 

But I found the hsql driver in multiple paths, on HDFS and in system folders...

 

I think we are so close to the solution, and yet so far!

 

I hope this gives you some ideas!
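
For reference, the "sqoop job --list" test with "--meta-connect" might be written as a Sqoop action roughly as follows. This is a sketch, not the exact workflow used: the ${jobTracker} and ${nameNode} parameters are placeholders, and the metastore URL is the one from the sqoop-site.xml posted earlier in the thread.

```xml
<!-- Sketch only: parameters are placeholders; URL copied from the sqoop-site.xml above. -->
<sqoop xmlns="uri:oozie:sqoop-action:0.3">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <arg>job</arg>
    <!-- Point the job tool at the shared metastore explicitly -->
    <arg>--meta-connect</arg>
    <arg>jdbc:hsqldb:hsql://host.domain.ltd:16000/sqoop</arg>
    <arg>--list</arg>
</sqoop>
```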

Contributor
Posts: 37
Registered: ‎02-12-2016

Re: Configure oozie with sqoop

So the "sqoop job --list" command works, but "sqoop job --exec" does not...

 

Are my parameters wrong?

 

I tried replacing the <command> syntax with <arg>. Same error.

Contributor
Posts: 37
Registered: ‎02-12-2016

Re: Configure oozie with sqoop

I tried to show the config of one of our jobs. Same error. This was launched from my client workstation.

 

bash: sqoop job --show sqoop_job_name
Warning: /usr/lib/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/10/03 17:36:42 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.1
16/10/03 17:36:43 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.IllegalArgumentException: Passed Null for map null
java.lang.IllegalArgumentException: Passed Null for map null
        at org.apache.sqoop.util.SqoopJsonUtil.getMapforJsonString(SqoopJsonUtil.java:48)
        at org.apache.sqoop.SqoopOptions.loadProperties(SqoopOptions.java:621)
        at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.read(HsqldbJobStorage.java:299)
        at org.apache.sqoop.tool.JobTool.showJob(JobTool.java:232)
        at org.apache.sqoop.tool.JobTool.run(JobTool.java:287)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Cloudera Employee
Posts: 24
Registered: ‎06-17-2016

Re: Configure oozie with sqoop

Is the hsql driver added to your workflow? The launcher logs list all the files in the folder Sqoop is called from, and the hsql driver jar should be there.

If it's missing from there, you can add it to the lib/ folder next to your workflow.xml.
