Any disadvantage to enabling concurrency in hive?
Labels: Apache Hive
Created ‎09-01-2016 05:47 PM
I've been asked to set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager and hive.support.concurrency = true because a subset of users is concerned about dirty reads on an external table while an external job consolidates small files within a partition. They want to take an exclusive lock during the consolidation.
Does anyone know of a reason I should be wary of these settings? Could they affect performance for other jobs or users that have no need for them?
I guess another question would be: does "lock table" even work on an external table?
Thx,
-Vince
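For reference, the two settings named in the question can be tried in a single Hive session before changing them cluster-wide. This is only a minimal sketch: whether session-level overrides are permitted depends on how the cluster is configured, so treat that as an assumption.

```sql
-- Session-level sketch of the two properties from the question.
-- Assumption: the cluster allows these to be overridden per session.
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Print the effective values to confirm what the session is using.
SET hive.support.concurrency;
SET hive.txn.manager;
```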
Created ‎10-27-2016 07:51 PM
From what I'm hearing from other sources, this answer was inaccurate and fails to take into consideration how our cluster is being used. I disagree with it being tagged "best answer".
Created ‎10-28-2016 07:54 PM
The two properties hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager and hive.support.concurrency = true are set for ACID tables. External tables cannot be ACID tables as the ACID compactor cannot control the data managed by them.
Created ‎10-31-2016 02:58 PM
I thought those two settings pre-dated the introduction of ACID tables. I can understand the "External tables cannot be ACID tables..." part, but I would think those settings could be used to let users issue an "exclusive lock" on an external table, to prevent reading from it through Hive while external jobs manipulate the underlying files.
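The explicit lock described here would look roughly like the sketch below. The table name `logs_ext`, the partition column `dt`, and its value are hypothetical. One caveat worth verifying on your own Hive version: to my knowledge, explicit LOCK TABLE requests are rejected when hive.txn.manager is DbTxnManager, and are accepted only with the default transaction manager plus hive.support.concurrency = true, which relies on the ZooKeeper-based lock manager. Either way, the lock only guards access that goes through Hive; a job reading the files directly would not see it.

```sql
-- Hypothetical external table and partition, for illustration only.
-- Take an exclusive lock on the partition being consolidated.
LOCK TABLE logs_ext PARTITION (dt='2016-09-01') EXCLUSIVE;

-- Inspect the locks currently held on the table.
SHOW LOCKS logs_ext;

-- ... the external job consolidates the small files here ...

-- Release the lock once consolidation has finished.
UNLOCK TABLE logs_ext PARTITION (dt='2016-09-01');
```

Queries issued through Hive against that partition should wait or fail while the exclusive lock is held, depending on the lock retry settings in use.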
