Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Running an incremental sqoop job on Cloudera using Oozie

Highlighted

Running an incremental sqoop job on Cloudera using Oozie

Explorer

I'm scheduling a sqoop incremental job using a shell script, but it behaves differently from a conventional job, it gives me the following error:

20/07/07 15:38:26 ERROR tool.JobTool: I/O error performing job operation: java.io.IOException: Exception creating SQL connection
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:217)
at org.apache.sqoop.metastore.hsqldb.AutoHsqldbStorage.open(AutoHsqldbStorage.java:112)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:290)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.sql.SQLException: File input/output error /home/denap0p/.sqoop/metastore.db.properties java.io.FileNotFoundException: /home/denap0p/.sqoop/metastore.db.properties.new (No such file or directory)

 

the script has the following instructions:

sqoop job \
-Dhadoop.security.credential.provider.path=jceks://hdfs/user/denap0p/sqlserver.jceks \
-Dsqoop.export.records.per.statement=1 \
--create ch_job_inc \
-- import \
--connect 'jdbc:jtds:sqlserver://SQLQP002:1433;useNTLMv2=true;domain=JNL_NT;databaseName=TC31Scheduler' \
--username denap0p \
--password-alias sqlserver.password \
--query 'select * from dbo.Job where $CONDITIONS' \
--incremental lastmodified \
--check-column endtime \
--hcatalog-database hpchistperf \
--hcatalog-table ch_job \
--split-by ID \
--merge-key Id \
--hcatalog-storage-stanza 'stored as parquet'

sqoop job --exec ch_job_inc

 

Based on the information I've read, it seems that I need to use a metastore. We are using Cloudera CDH-5.14.2-1.cdh5.14.2.p0.3

 

Any advise is welcome. Thanks in advance.

 

 

3 REPLIES 3
Highlighted

Re: Running an incremental sqoop job on Cloudera using Oozie

Rising Star

Hi Heri,

 

You don't have to use a sqoop metastore, if you don't have a good reason to. If the same user will be executing the jobs for example, there is limited reason to use sqoop metastore. As I don't see you specifying --meta-connection option in your sqoop job, I suspect the sqoop-site.xml has a setting to try and connect to a sqoop metastore. Can you post settings here that start from sqoop.metastore in the XML file?

 

If there is any mention of sqoop.metastore settings in your sqoop-site.xml, I would suggest removing them and trying your script again to see if it works. However, this is assuming you are still at PoC stage and no production workloads are at risk of being messed up by XML change. 

 

 

 

Re: Running an incremental sqoop job on Cloudera using Oozie

Explorer

Thanks for the reply,

 

 I originally didn't know i had to use a meta connection or a metastore until I found that error. So I really don't have a configuration considered yet.

 

 What would you recommend me to do? Is it just matter of adding the --meta-connection parameter?

Highlighted

Re: Running an incremental sqoop job on Cloudera using Oozie

Rising Star

You don't have to use a sqoop metastore. What I recommend is you check your sqoop-site.xml file. Inside of that file look for any sqoop.metastore settings and remove them. That should stop sqoop jobs from trying to access the sqoop metastore.

Don't have an account?
Coming from Hortonworks? Activate your account here