Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

How to Update records in a Hive table concurrently?

New Member

I am trying to update two different records of an ACID-enabled Hive table from two different sessions, but I get a LockException reporting a write conflict. Is there a configuration parameter in Hive that allows this?

1 ACCEPTED SOLUTION


@Harish Nerella

Updating a table from two different processes at the same time, even when they touch two different rows, is not possible at the moment for ACID tables.

Currently, the ACID concurrency management mechanism works at the partition level for partitioned tables and at the table level for non-partitioned tables (which I believe is your case). What the system wants to prevent is two parallel transactions updating the same row. Unfortunately, it cannot track this at the individual row level; it tracks it at the partition level or the table level, respectively.
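To make the granularity point concrete, here is a toy model in Python. It is not Hive's implementation; it is a minimal first-committer-wins sketch, and every name in it (`ToyTxnManager`, `Txn`, `WriteConflict`, the table and partition values) is made up for illustration. The only idea it borrows from the explanation above is that a write is recorded per (table, partition) and the row is ignored.

```python
class WriteConflict(Exception):
    """Raised when an overlapping transaction already committed a write to the same unit."""


class Txn:
    def __init__(self, manager, start):
        self.manager = manager
        self.start = start        # logical time at which the transaction began
        self.write_set = set()    # (table, partition) units touched by this transaction

    def update(self, table, partition=None, row_id=None):
        # row_id is deliberately ignored: the toy model, like the behaviour
        # described above, tracks writes per partition (or per whole table
        # when partition is None), never per row.
        self.write_set.add((table, partition))

    def commit(self):
        self.manager.commit(self)


class ToyTxnManager:
    def __init__(self):
        self._clock = 0
        self._committed = []      # list of (commit_time, frozenset of units)

    def begin(self):
        self._clock += 1
        return Txn(self, self._clock)

    def commit(self, txn):
        # Conflict if a transaction that committed after txn started
        # wrote to any of the same units.
        for commit_time, units in self._committed:
            if commit_time > txn.start and units & txn.write_set:
                raise WriteConflict(sorted(units & txn.write_set, key=repr))
        self._clock += 1
        self._committed.append((self._clock, frozenset(txn.write_set)))


if __name__ == "__main__":
    mgr = ToyTxnManager()
    a, b = mgr.begin(), mgr.begin()
    a.update("orders", row_id=1)   # unpartitioned table, row 1
    b.update("orders", row_id=2)   # same table, a different row
    a.commit()
    try:
        b.commit()
    except WriteConflict as exc:
        print("second session rejected:", exc)
```

In this model, two overlapping transactions that write to different partitions of the same table both commit, and a transaction that starts after the other has committed also succeeds, which matches the sequential behaviour reported in this thread.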

Refer to JIRA HIVE-13395 for more details.


4 REPLIES

New Member

@Harish Nerella

If a table is to be used for ACID writes (insert, update, delete), the table property "transactional=true" must be set on that table. Also, hive.txn.manager must be set to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager, either in hive-site.xml or at the beginning of the session before any query is run.

Refer to https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions for more details.
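As a rough sketch of what that setup can look like when driven from a client script, the Python below collects the session statements and a table definition as plain strings. Several things here are assumptions rather than part of the reply above: the table `demo_acid`, its columns and bucket count are hypothetical; the helper `apply_acid_settings` is made up; `hive.support.concurrency` and `hive.exec.dynamic.partition.mode` are additional settings listed on the linked Hive Transactions page; and the cursor is assumed to be any DB-API style object with an `execute` method, as provided by whichever Hive client library you use.

```python
# Session-level settings to run before any query in the session.
# The txn manager value comes from the reply above; the other two are
# extra settings described on the Hive Transactions wiki page.
ACID_SESSION_SETTINGS = [
    "SET hive.support.concurrency=true",
    "SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
    "SET hive.exec.dynamic.partition.mode=nonstrict",
]

# Hypothetical table, for illustration only. The part that matters for
# ACID writes is the 'transactional'='true' table property; ORC storage
# and bucketing are shown because older Hive releases require them.
CREATE_ACID_TABLE = (
    "CREATE TABLE demo_acid (id INT, val STRING) "
    "CLUSTERED BY (id) INTO 4 BUCKETS "
    "STORED AS ORC "
    "TBLPROPERTIES ('transactional'='true')"
)


def apply_acid_settings(cursor):
    """Run each SET statement on a DB-API style cursor, in order."""
    for statement in ACID_SESSION_SETTINGS:
        cursor.execute(statement)
```

Calling `apply_acid_settings(cursor)` first in each session, before any UPDATE, mirrors the "beginning of the session" advice above. These settings enable ACID writes; they do not remove the table-level or partition-level write conflict discussed in this thread.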

New Member

Thank you for your answer.

I set all the ACID transaction-related configurations in both sessions, but I still get a write conflict. The caveat is that if I run the update statements sequentially (waiting for the query in the other session to complete), the update goes through.

New Member

Thank You. This helped.