
Does HCatalog support incremental import with dynamic partitions in the target table in Sqoop?

New Contributor

If I execute the command below, it works fine:

sqoop import --connect jdbc:mysql://sandbox.hortonworks.com:3306/test \
--driver com.mysql.jdbc.Driver \
--username xxxx --password xxxx \
--query 'select execution_id, project_id, project_name, flow_name, job_id, start_time, end_time, update_time FROM metrics WHERE $CONDITIONS' \
--split-by project_id \
--hcatalog-table metrics;

But when I include the incremental parameters:

sqoop import --connect jdbc:mysql://sandbox.hortonworks.com:3306/test \
--driver com.mysql.jdbc.Driver \
--username xxxx --password xxxx \
--query 'select execution_id, project_id, project_name, flow_name, job_id, start_time, end_time, update_time FROM metrics WHERE $CONDITIONS' \
--check-column update_time \
--incremental lastmodified \
--last-value 0 \
--split-by project_id \
--hcatalog-table metrics;

it fails with the error message below:

ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hadoop.fs.FileSystem.fixRelativePart(FileSystem.java:2207)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1310)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.sqoop.tool.ImportTool.initIncrementalConstraints(ImportTool.java:320)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:488)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
3 Replies

Master Collaborator

Can you try the suggestion from this thread? It suggests running this as a saved Sqoop job (sqoop job) rather than a plain import:

https://community.hortonworks.com/questions/58015/sqoop-hcataloghive-incremental-import-in-orc-forma...
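Something like the sketch below, using the connection details from your post (the job name incr_metrics is made up). A saved job records the --last-value in Sqoop's metastore after each run:

sqoop job --create incr_metrics -- import \
--connect jdbc:mysql://sandbox.hortonworks.com:3306/test \
--driver com.mysql.jdbc.Driver \
--username xxxx --password xxxx \
--query 'select execution_id, project_id, project_name, flow_name, job_id, start_time, end_time, update_time FROM metrics WHERE $CONDITIONS' \
--check-column update_time \
--incremental lastmodified \
--last-value 0 \
--split-by project_id \
--hcatalog-table metrics

sqoop job --exec incr_metrics

Note that with --password the job may prompt for the password again at --exec time unless the Sqoop metastore is configured to store it.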

New Contributor

I already tried that; it does not work for incremental imports.

I solved the incremental import with dynamic partitions for a daily log table.

I created a table partitioned by year/month/day:

# hcat -e "create table history.monitor (idvalitemcol bigint, dataregister string) partitioned by (year string, month string, day string) stored as rcfile;"

Load data:

# sqoop import --hive-drop-import-delims -m 4 \
--driver com.ibm.db2.jcc.DB2Driver \
--connect jdbc:db2://xxxx \
--username xxxx --password xxxx \
--query "select idvalitemcol, dataregister, year(dataregister) as year, month(dataregister) as month, day(dataregister) as day from MNT.LOGMONITOR WHERE \$CONDITIONS AND DATAREGISTER = '2016-11-06'" \
--split-by DATAREGISTER \
--hcatalog-database history \
--hcatalog-table monitor
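To check that a day landed in its own partition, you can list the table's partitions. A minimal check; SHOW PARTITIONS is standard HiveQL, and the hcat CLI should accept it as well:

# hive -e "show partitions history.monitor;"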

It only works for complete days (the smallest partition key).
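To run this daily, a small wrapper works. A rough sketch, assuming GNU date is available and that yesterday's data is complete; the script and the LOAD_DATE variable are mine, not part of the original commands:

#!/bin/bash
# Import the most recently completed day into its own year/month/day partition.
LOAD_DATE=$(date -d "yesterday" +%F)   # GNU date; yields e.g. 2016-11-06
sqoop import --hive-drop-import-delims -m 4 \
--driver com.ibm.db2.jcc.DB2Driver \
--connect jdbc:db2://xxxx \
--username xxxx --password xxxx \
--query "select idvalitemcol, dataregister, year(dataregister) as year, month(dataregister) as month, day(dataregister) as day from MNT.LOGMONITOR WHERE \$CONDITIONS AND DATAREGISTER = '${LOAD_DATE}'" \
--split-by DATAREGISTER \
--hcatalog-database history \
--hcatalog-table monitor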
