We are ingesting data from SQL Server (OLTP) into Hive. Sqoop incremental imports of updated data are not working against SQL Server.
It is still an open issue: https://issues.apache.org/jira/browse/SQOOP-2421
Is there any workaround to get the updated data into Hive tables?
You can simply use a manual filter (a WHERE condition) in Sqoop. For example, run Sqoop hourly in Oozie and fetch the past hour of data.
You suggested using a Sqoop saved job in https://community.hortonworks.com/questions/13132/best-practice-to-import-data-from-sql-server-to-hi....
If I use the WHERE condition, I have to supply the value in the WHERE clause manually. I have to create 100 tables for each client. It would be a nightmare to run each job manually and supply the value every time.
Is there a more convenient way to do this?
What I meant by the WHERE condition is that you could run an Oozie job that runs daily and adds the filter parameters automatically.
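To make the idea of filling in the filter automatically more concrete, here is a rough Python sketch that builds the Sqoop command for one table and one time window. It is only an illustration under assumptions: the JDBC URL, the `last_modified` column, the `/staging` directory layout and the function name `build_sqoop_import` are all hypothetical, and credentials are left out. Only the Sqoop options themselves (`--connect`, `--table`, `--where`, `--target-dir`, `-m`) are standard.

```python
from datetime import datetime, timedelta

# Hypothetical connection string; replace with your own server and database.
JDBC_URL = "jdbc:sqlserver://dbhost:1433;databaseName=sales"


def build_sqoop_import(table, window_end, hours=1, ts_column="last_modified"):
    """Return the sqoop command (as an argument list) for one table and window.

    ts_column is an assumed "last changed" timestamp column on the source table.
    """
    window_start = window_end - timedelta(hours=hours)
    fmt = "%Y-%m-%d %H:%M:%S"
    # Half-open window [start, end) so consecutive runs do not overlap.
    where = (
        f"{ts_column} >= '{window_start.strftime(fmt)}' "
        f"AND {ts_column} < '{window_end.strftime(fmt)}'"
    )
    return [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--table", table,
        "--where", where,
        # Assumed staging layout: one directory per table and window.
        "--target-dir", f"/staging/{table}/{window_end.strftime('%Y%m%d%H')}",
        "-m", "1",
    ]


if __name__ == "__main__":
    # Loop over all tables instead of running each job by hand.
    for name in ["orders", "customers"]:
        print(" ".join(build_sqoop_import(name, datetime(2016, 1, 15, 10))))
```

The scheduler would supply `window_end` (for example the scheduled run time), so nobody has to type the dates, and the same function can be looped over all tables.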
Hi Benjamin,
Since Sqoop incremental mode is not working against SQL Server, the Sqoop query clause is the only option left. If there are no changes to the historical data in SQL Server, we only need to run the Sqoop query for the last date. If historical data has changed, the query also needs to run for the historical dates that have changes. So the date values in the WHERE clause of the query depend on which historical dates changed.
Can Oozie variables or parameters handle this situation?
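As a sketch of the date logic described above, the WHERE clause could be generated from the run date plus whatever historical dates are known to have changed. How the changed dates are detected is not covered here and is an assumption (for example an audit table or a modified-timestamp column); the `txn_date` column and the function name `build_date_filter` are hypothetical as well.

```python
from datetime import date, timedelta


def build_date_filter(run_date, changed_dates=(), date_column="txn_date"):
    """Build a WHERE fragment covering the previous day plus any changed dates.

    changed_dates: historical dates known to contain modified rows
                   (how they are detected is outside this sketch).
    """
    previous_day = run_date - timedelta(days=1)
    # Always reload the last date; add historical dates only when they changed.
    dates = sorted(set(changed_dates) | {previous_day})
    quoted = ", ".join(f"'{d.isoformat()}'" for d in dates)
    return f"{date_column} IN ({quoted})"


if __name__ == "__main__":
    print(build_date_filter(date(2016, 1, 15)))
    print(build_date_filter(date(2016, 1, 15), [date(2016, 1, 3), date(2016, 1, 10)]))
```

The resulting string could then be handed to the Sqoop step as a single parameter, so the same job definition covers both the "no historical changes" case and the "some historical dates changed" case.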