Hive incremental updates
Labels: Apache HBase, Apache Hive
Created ‎07-03-2017 01:37 AM
I'd like suggestions on an incremental update strategy for our tables. We have a set of source tables in DB2 that are refreshed daily, and we need to refresh them in Hive as well. Which approach would you suggest?
The source tables receive new inserts as well as updates to existing records.
1) Approach 1: Use HBase to store the data, since it allows updates, and build Hive external tables on top of it. My concern is whether this will hurt queries that join the HBase-backed Hive tables with large ORC Hive tables.
2) Approach 2: Use the four-step incremental update strategy suggested by Hortonworks (HDP): https://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/
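For readers unfamiliar with approach 2, its core is a reconcile step: combine the base table with the newly ingested changes and keep only the most recent version of each record. Below is a minimal Python sketch of that logic, not the Hive implementation itself. The row shape `(id, payload, modified_ts)` and the sample data are assumptions made for illustration; the actual schema is not given in this thread.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical row shape: (id, payload, modified_ts). These columns are
# assumptions for illustration; the thread does not describe the schema.
Row = Tuple[int, str, str]


def reconcile(base: Iterable[Row], incremental: Iterable[Row]) -> List[Row]:
    """Return one row per id, keeping the row with the latest modified_ts."""
    latest: Dict[int, Row] = {}
    # Base rows are visited first, so on a timestamp tie the incremental
    # row wins because of the >= comparison below.
    for row in list(base) + list(incremental):
        key, _, modified_ts = row
        current = latest.get(key)
        # ISO-8601 date strings compare correctly as plain strings.
        if current is None or modified_ts >= current[2]:
            latest[key] = row
    return sorted(latest.values())


base_rows = [(1, "alice", "2017-07-01"), (2, "bob", "2017-07-01")]
delta_rows = [(2, "bob-updated", "2017-07-02"), (3, "carol", "2017-07-02")]
reconciled = reconcile(base_rows, delta_rows)
print(reconciled)
```

In Hive this is typically expressed as a view over the union of the base and incremental tables that selects the row with the maximum modification timestamp per key. That requires the source to expose a reliable last-modified column.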
Created ‎07-03-2017 04:45 PM
Hi @Abhijeet Rajput,
Prior to HDP 2.6 you'll need to use the solution outlined in #2. HDP 2.6 includes Hive MERGE, so you can now load the daily changes into a staging table and execute a MERGE statement against an ACID-enabled table: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Merge
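As a rough illustration of the MERGE route, here is a small Python helper that renders the kind of statement you would run from the staging table. This is only a sketch: the table names (`customer`, `customer_staging`), the key column `id`, and the column list are hypothetical, so substitute your own schema and check the result against the Hive DML manual linked above.

```python
from typing import Sequence


def build_merge(target: str, staging: str, key: str, columns: Sequence[str]) -> str:
    """Render a Hive MERGE statement that upserts staging rows into target.

    `columns` should list every column of the target table in table order,
    including `key`, because the INSERT VALUES clause is positional.
    """
    # The key column is the join condition, so it is not updated.
    updates = ", ".join(f"{c} = s.{c}" for c in columns if c != key)
    values = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {staging} AS s\n"
        f"ON t.{key} = s.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT VALUES ({values})"
    )


# Hypothetical table and column names, for illustration only.
statement = build_merge(
    "customer", "customer_staging", "id", ["id", "name", "modified_ts"]
)
print(statement)
```

The target has to be a transactional (ACID) table stored as ORC with `transactional=true` in its table properties, and Hive does not allow updating partition or bucketing columns, so leave those out of the SET list.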
