I would like suggestions on how to implement an incremental load strategy for our tables.
We have a set of source tables that are refreshed daily in the source system (DB2),
and we need to refresh them in the Hive database as well. Which approach would you suggest?
The source tables receive new inserts as well as updates to existing records.
1) Approach 1: Use HBase to store the data, since it allows updates, and build Hive external tables that refer to the HBase tables.
I am not sure whether this will affect queries that join the Hive-HBase tables with large ORC Hive tables.
2) Approach 2: Use the four-step incremental update approach suggested by Hortonworks (HDP)?
https://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/
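
For context, my understanding of the reconcile idea behind approach 2 is roughly the following: combine the existing base rows with the day's incremental rows and keep only the most recently modified version of each record. This is a minimal Python sketch of that logic only, not Hive code and not the blog's implementation. The column names `id` and `modified_date`, the function name `reconcile`, and the sample rows are all assumptions made for illustration.

```python
from datetime import date


def reconcile(base_rows, incremental_rows, key="id", ts="modified_date"):
    """Return one row per key, keeping the most recently modified version.

    `key` and `ts` are hypothetical column names; substitute the real
    primary key and last-modified columns of the source table.
    """
    latest = {}
    # Base rows are visited first, so on a timestamp tie the incremental row wins.
    for row in list(base_rows) + list(incremental_rows):
        current = latest.get(row[key])
        if current is None or row[ts] >= current[ts]:
            latest[row[key]] = row
    return sorted(latest.values(), key=lambda r: r[key])


# Hypothetical sample data: id 2 is updated, id 3 is a new insert.
base = [
    {"id": 1, "value": "a", "modified_date": date(2017, 1, 1)},
    {"id": 2, "value": "b", "modified_date": date(2017, 1, 1)},
]
incremental = [
    {"id": 2, "value": "b2", "modified_date": date(2017, 1, 2)},
    {"id": 3, "value": "c", "modified_date": date(2017, 1, 2)},
]

reconciled = reconcile(base, incremental)
```

In Hive itself I assume the same result would come from a view or query over the union of the base and incremental tables that selects the row with the maximum modification timestamp per key.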