In Hive i have a staging table 'Stg_emp' . Below is sample records for stg_emp table.
Empid Ename Last_date
----- ----- ---------
1 Eliza 19/02/2018 10:45:02
2 Mike 18/02/2018 09:45:04
3 James 18/02/2018 08:25:01
For this staging table, i will get newly inserted rows through Sqoop_job1 and Updated rows through Sqoop_job2
Sqoop_job1
----------
sqoop job --create sqoop_job1 --import --connect jdbc:oracle:thin:@10.11.111.111:1222:DEV --username sq_user --password sq_us --table Stg_emp --incremental append
-check-column empid --target-dir /user/sq_sample/stg_emp -m 1 --merge-key empid
Sqoop_job2
----------
sqoop job --create sqoop_job2 --import --connect jdbc:oracle:thin:@10.11.111.111:1222:DEV --username sq_user --password sq_us --table Stg_emp --incremental lastmodified
-check-column last_date --target-dir /user/sq_sample/stg_emp -m 1 --merge-key empid
My Questions on above sample table.
1. If row contains Empid 3 is deleted then how will i sink those deletions in hive.
2. If i want to move only newly inserted rows of stg_emp into another hive table , how to do that
3. If i want to move only updated rows of stg_emp into another hive table, how to do that.
Please correct me if my sqoop jobs are wrong.