Hive Compaction for ACID Transactions
Labels: Apache Hive
Created ‎02-07-2016 09:25 PM
Hello Hive SMEs,
I am setting up Hive tables for ACID transactions. A fair (not high) number of inserts and updates is expected on each table.
Should compactions be scheduled every day, or should I let Hive manage compaction? Are there any pros and cons to Hive-managed compactions?
Hive version: 0.14
Thank you,
Pranay Vyas
Created ‎02-07-2016 09:34 PM
Please see this before you do this in prod
https://community.hortonworks.com/content/kbentry/4321/hive-acid-current-state.html
Created ‎02-07-2016 09:39 PM
Thanks @Neeraj Sabharwal
Created ‎02-08-2016 07:24 PM
How often to run compaction is a function of how quickly you are generating delta files (see https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-BasicDesign for more info). Less frequent compactions make reads more expensive, because readers have to merge more delta files with the base. Keep in mind that this system is designed for slowly changing data. Updating 1 row of a 1-billion-row table every second will not work well. The cost of executing an SQL UPDATE statement is roughly the same whether it matches 1 row or 10K rows.
The other response regarding the state of this feature in Hive 0.14 is still valid.
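To make the "function of how quickly you are generating delta files" point concrete, here is a rough Python sketch of the kind of check Hive-managed compaction performs per table or partition. It is a simplified model, not Hive's actual code. The two thresholds mirror the documented defaults of hive.compactor.delta.num.threshold (10) and hive.compactor.delta.pct.threshold (0.1); the function name and its arguments are made up for illustration.

```python
# Simplified model of the automatic compaction decision (an assumption-laden
# sketch, not Hive's implementation).
DELTA_NUM_THRESHOLD = 10   # documented default of hive.compactor.delta.num.threshold
DELTA_PCT_THRESHOLD = 0.1  # documented default of hive.compactor.delta.pct.threshold


def compaction_needed(num_deltas, delta_bytes, base_bytes):
    """Return 'major', 'minor', or None for one table or partition.

    num_deltas  -- number of delta directories currently present
    delta_bytes -- total size of those delta directories
    base_bytes  -- size of the current base (0 if there is none yet)
    """
    # Major compaction: deltas have grown large relative to the base.
    if base_bytes > 0 and delta_bytes / base_bytes > DELTA_PCT_THRESHOLD:
        return "major"
    # Minor compaction: too many delta directories have piled up.
    if num_deltas > DELTA_NUM_THRESHOLD:
        return "minor"
    return None


if __name__ == "__main__":
    # A few small deltas against a large base: nothing to do yet.
    print(compaction_needed(num_deltas=3, delta_bytes=10, base_bytes=1000))
    # Many small deltas: merge them into one.
    print(compaction_needed(num_deltas=11, delta_bytes=10, base_bytes=1000))
    # Deltas exceed 10% of the base: rewrite the base.
    print(compaction_needed(num_deltas=3, delta_bytes=200, base_bytes=1000))
```

With a modest write rate, the thresholds are crossed only occasionally, so letting Hive trigger compaction is usually enough. A fixed daily schedule mainly helps if you want the rewrite to happen in a quiet window; a manual compaction can be requested with ALTER TABLE ... COMPACT 'minor' or 'major'.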
Created ‎02-10-2016 04:57 PM
Thank you, @Eugene Koifman. This helps.
