03-03-2016
04:59 PM
1 Kudo
Thanks @Shishir Saxena and @Predrag Minovic. @Predrag Minovic, thank you, but if I don't use --append or --merge-key it throws the error below. With the job definition below, Sqoop picks up the date dynamically, so hopefully that is not a problem?

ERROR tool.ImportTool: Error during import: --merge-key or --append is required when using --incremental lastmodified and the output directory exists.

sqoop job --create SRs -- import --connect "jdbc:sqlserver://localhost:1431;database=xx;username=root;password=welcome1" --query 'select * from sr_table where $CONDITIONS' --target-dir /data/sr/sr_table --append --check-column LAST_UPD --incremental lastmodified --last-value '2016-03-03 15:56:47.92' --split-by ROW_ID
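For reference, a minimal sketch of the --merge-key variant mentioned in the error, assuming the same connection string and query as above and that ROW_ID is the table's unique key (not verified against this exact Sqoop/SQL Server setup): with --merge-key instead of --append, Sqoop runs a merge step after the incremental import so re-imported rows replace older copies in the existing target directory rather than piling up as duplicates.

# Sketch only: connection details and column names are taken from the post above.
# --merge-key tells Sqoop which column identifies a row, so updated rows
# overwrite the earlier versions in /data/sr/sr_table instead of being appended.
sqoop job --create SRs -- import \
  --connect "jdbc:sqlserver://localhost:1431;database=xx;username=root;password=welcome1" \
  --query 'select * from sr_table where $CONDITIONS' \
  --target-dir /data/sr/sr_table \
  --check-column LAST_UPD --incremental lastmodified \
  --last-value '2016-03-03 15:56:47.92' \
  --merge-key ROW_ID --split-by ROW_ID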
03-03-2016
01:26 AM
1 Kudo
@Shishir, thanks for your reply.
1) Save the job with last-value 1900-01-01 (the first time).
2) Should saving the same job again with the new last-value solve the problem (i.e. modifying the same job)?

sqoop job --create sr_table -- import --connect "jdbc:sqlserver://localhost:1431;database=test;username=root;password=welcome1" --query 'select * from sr_table where $CONDITIONS' --target-dir /data/sr/sr_table --append --check-column SR_LAST_UPD --incremental lastmodified --last-value last-value --split-by ROW_ID
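A hedged sketch of how to check this without re-creating the job, assuming the sr_table job name from the command above: when a saved Sqoop job is executed, Sqoop records the updated last-value in its job metastore, and the stored checkpoint can be inspected with sqoop job --show (printed among the job properties, typically as incremental.last.value, though the exact output varies by Sqoop version).

# Sketch only: sr_table is the saved job name from the command above.
# After each --exec run, Sqoop writes the new checkpoint into the metastore,
# so the job does not need to be re-created with a hand-edited --last-value.
sqoop job --exec sr_table
sqoop job --show sr_table | grep -i last.value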
03-02-2016
09:07 PM
3 Kudos
Hi, I have a Service Requests (SRs) table with about 10k records as of today. I am trying to implement incremental logic so that when the job runs tomorrow it captures only newly inserted and updated records.

Table structure: row_id (unique), SR#, sr_owner, sr_group, sr_last_update_date (timestamp)

Approach:
1) Created a sqoop job as below:
sqoop job --create sr_table -- import --connect "jdbc:sqlserver://localhost:1431;database=test;username=root;password=welcome1" --query 'select * from sr_table where $CONDITIONS' --target-dir /data/sr/sr_table --append --check-column SR_LAST_UPD --incremental lastmodified --last-value '1900-01-01' --split-by ROW_ID
2) Running this for the first time works fine (full load).
3) If I re-execute it, it does not compare the upper-bound value against the last run time (SR_LAST_UPD).

I was able to follow this: https://github.com/abajwa-hw/single-view-demo/blob/master/singleview-mysql-advanced-23.md

I'm looking for a more dynamic way to compare the sqoop job execution time with the sr_last_upd value. Thanks.
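A minimal sketch of the saved-job workflow that gives the dynamic comparison, assuming the sr_table job defined in step 1 above: when a saved job is run with sqoop job --exec, Sqoop itself records the time the import started as the new last-value in its metastore, so each subsequent run only pulls rows whose SR_LAST_UPD is newer than the previous run.

# Sketch only: job name and columns are the ones defined in step 1 above.
sqoop job --list                # confirm the saved job exists
sqoop job --exec sr_table       # first run: full load starting from '1900-01-01'
sqoop job --exec sr_table       # later runs: only rows with SR_LAST_UPD newer than the last run
sqoop job --show sr_table       # inspect the stored checkpoint kept by the metastore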
Labels:
- Apache Hadoop
- Apache Hive
- Apache Sqoop