How to do Sqoop Incremental Import through Oozie Coordinator

Explorer

I want to continuously import data from Postgres into Hive using Sqoop jobs scheduled by an Oozie coordinator. An incremental import needs the last imported value so that the next run can start from it, but in this case we don't know the last value because the data grows continuously. Is there any way to automate this process?

Could anyone please help me resolve this issue?
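
One common approach to the question above is a Sqoop saved job: when an incremental import runs as a saved job, Sqoop records the new last value in its metastore after each successful run, so the caller never has to supply it. The sketch below only prints the two commands involved. The JDBC URL, table, check column, and job name are all hypothetical placeholders, and credential options are left out.

```shell
#!/usr/bin/env bash
# Sketch: build the Sqoop commands for a saved incremental-import job.
# Every value below is a placeholder, not something taken from this thread.

PG_URL="jdbc:postgresql://dbhost:5432/mydb"   # hypothetical Postgres JDBC URL
TABLE="events"                                # hypothetical source table
CHECK_COL="id"                                # hypothetical ever-increasing column
JOB_NAME="pg_events_incr"                     # hypothetical saved-job name

# Print the one-time command that creates the saved job.
# --last-value 0 is only the starting point; for a saved job, Sqoop
# updates the stored value after each successful run.
create_job_cmd() {
  printf 'sqoop job --create %s -- import --connect %s --table %s --hive-import --incremental append --check-column %s --last-value 0\n' \
    "$JOB_NAME" "$PG_URL" "$TABLE" "$CHECK_COL"
}

# Print the command that each scheduled run would execute.
exec_job_cmd() {
  printf 'sqoop job --exec %s\n' "$JOB_NAME"
}

create_job_cmd
exec_job_cmd
```

An Oozie action can run on any node of the cluster, so the saved job generally needs to live in a metastore that every node can reach. Both commands accept --meta-connect with the metastore's JDBC URL for that purpose.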

 


Expert Contributor

Hi,

Can you please install a Sqoop gateway on that node?

We always recommend installing the Sqoop gateway on the host from which you run Sqoop commands.

Link: https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cm_mc_sqoop1_client.html

Also, can you please confirm whether you are able to run normal sqoop import commands from that host?

Can you run sqoop job --list and share the output?

Do you have enough disk space on that host? Sqoop needs to create local directories for its metastore.

Regards

Nitish
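
The checks requested above can be gathered in one pass. This is a minimal sketch that assumes the private metastore lives under the user's home directory, which is the usual default; a given cluster may be configured differently.

```shell
#!/usr/bin/env bash
# Sketch: quick host checks before debugging Sqoop saved jobs.

# Available space, in kilobytes, on the filesystem holding the given path.
avail_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Report whether the sqoop client is on PATH for this host.
sqoop_on_path() {
  if command -v sqoop >/dev/null 2>&1; then echo "found"; else echo "missing"; fi
}

echo "sqoop client: $(sqoop_on_path)"
echo "free KB under home directory: $(avail_kb "${HOME:-/}")"

# If the client is present, the next step from the reply would be:
#   sqoop job --list
```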

 

Explorer

Hi,

We installed a full CDH cluster in our environment, not the individual components.

We are able to run normal Sqoop commands and get their output. The problem is only with Sqoop saved jobs.

When I ran sqoop job --list, I got the same error:

19/08/26 11:17:18 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.16.1
19/08/26 11:17:18 ERROR tool.JobTool: I/O error performing job operation: java.io.IOException: Exception creating SQL connection
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:217)
at org.apache.sqoop.metastore.hsqldb.AutoHsqldbStorage.open(AutoHsqldbStorage.java:112)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:290)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.sql.SQLException: socket creation error
at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
at org.hsqldb.jdbc.jdbcConnection.<init>(Unknown Source)
at org.hsqldb.jdbcDriver.getConnection(Unknown Source)
at org.hsqldb.jdbcDriver.connect(Unknown Source)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:177)
... 8 more

We have 400 GB of disk space available on that host.

Thanks,

Akhila.
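
The "socket creation error" in the trace above is raised inside the HSQLDB JDBC driver, which suggests (this is an inference, not something confirmed in the thread) that the client on this host is configured to reach a metastore server over the network rather than a local file. If so, the configured URL would be the value of sqoop.metastore.client.autoconnect.url in sqoop-site.xml. The helper below is a sketch that splits such a URL into host and port so that reachability can be checked; the URL shown is a hypothetical example.

```shell
#!/usr/bin/env bash
# Sketch: extract host and port from an HSQLDB server URL.

# Split a URL of the form jdbc:hsqldb:hsql://host:port/db into "host port".
# Prints nothing and returns 1 for any other URL shape (e.g. a file URL).
hsql_host_port() {
  local rest="${1#jdbc:hsqldb:hsql://}"
  [ "$rest" != "$1" ] || return 1      # prefix did not match
  rest="${rest%%/*}"                   # drop the /database suffix
  case "$rest" in
    *:*) echo "${rest%%:*} ${rest##*:}" ;;
    *)   return 1 ;;
  esac
}

# Hypothetical URL; substitute the value found in sqoop-site.xml, if any.
url="jdbc:hsqldb:hsql://metastore-host:16000/sqoop"
hsql_host_port "$url"

# With the host and port known, a TCP check from the failing node
# (for example with nc -z, where available) shows whether a metastore
# server is actually listening there.
```

If the URL turns out to be a file URL instead, the helper prints nothing, and the cause is more likely local to the host than network related.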

Expert Contributor

Hi,

Did you install the Sqoop metastore?

Did you make any changes to the Sqoop configuration via CM?

Regards

Nitish

Expert Contributor

Also, if you run the same job from another node, does it fail with the same message? Can you please confirm?

Regards

Nitish

Explorer

When I tried to run the same job from another node, the job was created.

What is the problem with the master node? The job executes from the slave node.

Could you please explain in detail?

Thanks,

Akhila.

Expert Contributor

This means that creating the job fails on the master node but succeeds when you run it from the slave node.

Am I right?

If yes, then I would ask you to check the differences between the hosts. Also check whether the Sqoop gateway has been deployed and, if so, on which node.

There is some setup issue in your cluster that is causing this.

Regards

Nitish
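
One way to check the differences between the hosts is to compare their Sqoop client configuration files directly. This is a sketch only: it assumes each host's sqoop-site.xml has first been copied to the local machine, and the path /etc/sqoop/conf/sqoop-site.xml mentioned in the comments is a typical CDH location, not one confirmed in this thread.

```shell
#!/usr/bin/env bash
# Sketch: compare two copies of a config file.

# Prints "identical", "different", or "unreadable".
# When the files differ, the unified diff is written to stderr.
compare_conf() {
  if [ ! -r "$1" ] || [ ! -r "$2" ]; then
    echo "unreadable"
    return 2
  fi
  if diff -u "$1" "$2" >&2; then echo "identical"; else echo "different"; fi
}

# Hypothetical usage, after fetching the file from each host, e.g.:
#   scp master:/etc/sqoop/conf/sqoop-site.xml master-sqoop-site.xml
#   scp slave:/etc/sqoop/conf/sqoop-site.xml  slave-sqoop-site.xml
#   compare_conf master-sqoop-site.xml slave-sqoop-site.xml
```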

Explorer

Yes, the job is created on the slave node but not on the master node.

There is no difference between the hosts.

Could you please tell me where the problem is?

Thanks,

Akhila.

Expert Contributor

Hi,

I am not sure at this point what the issue on that host is, as I am not able to debug host-specific issues.

Have you configured the Sqoop gateway on both hosts?

Regards

Nitish

Explorer

Hi,

Yes, the Sqoop gateway is configured on both hosts.

Please tell me how to run Sqoop saved jobs on the master node itself.

Thanks,

Akhila.

Explorer

Hi,

We didn't install it manually. I think the Sqoop metastore is included in the CDH cluster installation itself.

No, we didn't change any Sqoop configurations.

When I ran the command sudo -u sqoop sqoop-metastore, I got this output:

19/08/26 11:55:48 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.16.1
[Server@486f90fe]: [Thread[main,5,main]]: checkRunning(false) entered
[Server@486f90fe]: [Thread[main,5,main]]: checkRunning(false) exited
[Server@486f90fe]: [Thread[main,5,main]]: setDatabasePath(0,file:/usr/lib/sqoop/metastore/)
[Server@486f90fe]: [Thread[main,5,main]]: checkRunning(false) entered
[Server@486f90fe]: [Thread[main,5,main]]: checkRunning(false) exited
[Server@486f90fe]: [Thread[main,5,main]]: setDatabaseName(0,sqoop)
[Server@486f90fe]: [Thread[main,5,main]]: putPropertiesFromString(): [hsqldb.write_delay=false]
[Server@486f90fe]: [Thread[main,5,main]]: checkRunning(false) entered
[Server@486f90fe]: [Thread[main,5,main]]: checkRunning(false) exited
[Server@486f90fe]: Initiating startup sequence...
[Server@486f90fe]: Server socket opened successfully in 6 ms.
[Server@486f90fe]: [Thread[HSQLDB Server @486f90fe,5,main]]: Database [index=0db=file:/usr/lib/sqoop/metastore/, alias=sqoop] did not open: org.hsqldb.HsqlException: File input/output error /usr/lib/sqoop/metastore/.properties java.io.FileNotFoundException: /usr/lib/sqoop/metastore/.properties.new (No such file or directory)
[Server@486f90fe]: [Thread[HSQLDB Server @486f90fe,5,main]]: Shutting down because there are no open databases
[Server@486f90fe]: Initiating shutdown sequence...
[Server@486f90fe]: Shutdown sequence completed in 2 ms.
[Server@486f90fe]: 2019-08-26 11:55:48.641 SHUTDOWN : System.exit() was not called
19/08/26 11:55:48 INFO hsqldb.HsqldbMetaStore: Server started on port 16000 with protocol HSQL
19/08/26 11:55:48 INFO hsqldb.HsqldbMetaStore: Got shutdown notification
19/08/26 11:55:48 INFO tool.MetastoreTool: Server thread has quit.

 

Thanks,

Akhila.
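
In the log above, HSQLDB reports a FileNotFoundException for /usr/lib/sqoop/metastore/.properties.new and then shuts down because no database could be opened. That pattern usually means the storage directory is missing or not writable by the user running the metastore, although the thread does not confirm this. The sketch below classifies a directory accordingly; the path is the one from the log.

```shell
#!/usr/bin/env bash
# Sketch: check whether the metastore storage directory is usable.

# Classify a directory as "missing", "not-writable", or "ok"
# from the point of view of the user running this script.
dir_status() {
  if [ ! -d "$1" ]; then
    echo "missing"
  elif [ ! -w "$1" ]; then
    echo "not-writable"
  else
    echo "ok"
  fi
}

# Path taken from the log above. Run this as the same user that
# starts the metastore (sqoop, in the command shown in the post).
dir_status /usr/lib/sqoop/metastore
```

The storage location is set by sqoop.metastore.server.location in sqoop-site.xml, so pointing it at a directory the sqoop user owns is another option.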