
MySQL to Hive with incremental column


I am learning the basics of Sqoop and want to load data from a MySQL table into Hive. The MySQL table receives new data every 5 minutes. I have worked out how to create a Sqoop job that connects and runs the import, but I cannot understand how Sqoop knows the last value of the primary key column so that it extracts only the newer rows on each run.

 

https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_incremental_imports

 

For example, in the sqoop command below, do I have to supply the last value myself, or can Sqoop work it out on its own?

Does the check-column have to be the primary key column?

 

sqoop job --create <JOB NAME> \
-- import \
--connect "jdbc:<PATH>" \
--username <USERNAME> \
--password <PASSWORD> \
--target-dir <DIR> \
--table <MYSQL TABLE> \
--hive-import \
--hive-table <HIVE TABLE> \
--fields-terminated-by ',' \
--escaped-by \\ \
--split-by <COLUMN TO SPLIT ON ACROSS MAPPERS> \
--num-mappers 5 \
--incremental append \
--check-column <CHECK COLUMN> \
--last-value <LAST VALUE>

 

 
