Community Articles

Find and share helpful community-sourced technical articles.
Labels (1)
avatar
Guru

Simple illustration of locking in Hive when ACID is enabled

Considerations for illustration

  • Cluster Version: HDP -2.5.6.0
  • Hive Version: Hive 1.2.1000
  • Enabled with following properties in place
    • hive.support.concurrency=true
    • hive.compactor.initiator.on=true
    • hive.compactor.worker.threads=1
    • hive.exec.dynamic.partition.mode=nonstrict
    • hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager (if non-Ambari cluster)

Types of Supported Locks

  • S = SHARED or SHARED_READ
  • X = EXCLUSIVE

Tables used for testing

  • orc_tab (ORC format table with col1 int and col2 string), non-transactional
  • orc_tab_bucketed(ORC format table with col1 int and col2 string, transactional)
  • txt_tab (TEXT format table with col1 int, col2 string, non-transactional, for loading purposes)
  • Either tables have closed to 5 GB data on a Single node cluster



SCENARIO 1 (Non Transactional Table)

  • SELECT blocks ALTER
  • SELECT starts first followed by ALTER

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "ALTER TABLE orc_tab ADD COLUMNS (col3 string)"

Session C

+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+
|  lockid  | database  |  table   | partition  | lock_state  |  blocked_by   |  lock_type   | transaction_id  | last_heartbeat  |  acquired_at   | user  |        hostname         |                        agent_info                         |
+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+
| Lock ID  | Database  | Table    | Partition  | State       | Blocked By    | Type         | Transaction ID  | Last Heartbeat  | Acquired At    | User  | Hostname                | Agent Info                                                |
| 31.1     | default   | orc_tab  | NULL       | ACQUIRED    |               | SHARED_READ  | NULL            | 1517003062122   | 1517003062122  | hive  | xlpatch.openstacklocal  | hive_20180126214422_aaeb4b28-5170-4131-b509-ef0213c8b842  |
| 32.1     | default   | orc_tab  | NULL       | WAITING     | 31.1          | EXCLUSIVE    | NULL            | 1517003063314   | NULL           | hive  | xlpatch.openstacklocal  | hive_20180126214422_a65af104-05d1-4c19-ab54-7bb37b4cdbfa  |
+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+



SCENARIO 2 (Non Transactional Table)

  • SELECT blocks INSERT OVERWRITE
  • SELECT starts first followed by INSERT OVERWRITE

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "INSERT OVERWRITE TABLE orc_tab SELECT col1,col2 from txt_tab"

Session C

0: jdbc:hive2://localhost:10000/default> show locks;+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+|  lockid  | database  |  table  | partition  | lock_state  |  blocked_by  |  lock_type  | transaction_id  | last_heartbeat  |  acquired_at  | user  |  hostname  |  agent_info  |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID  | Database  | Table  | Partition  | State  | Blocked By  | Type  | Transaction ID  | Last Heartbeat  | Acquired At  | User  | Hostname  | Agent Info  || 36.1  | default  | orc_tab  | NULL  | ACQUIRED  |  | SHARED_READ  | NULL  | 1517003567582  | 1517003567582  | hive  | xlpatch.openstacklocal  | hive_20180126215247_7537e30b-d5bf-4fc8-aa23-8e860efe1ac8  || 37.1  | default  | txt_tab  | NULL  | WAITING  |  | SHARED_READ  | NULL  | 1517003568897  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126215248_875685ed-a552-4009-892c-e13c61cf7eb5  || 37.2  | default  | orc_tab  | NULL  | WAITING  | 36.1  | EXCLUSIVE  | NULL  | 1517003568897  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126215248_875685ed-a552-4009-892c-e13c61cf7eb5  |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+



SCENARIO 3 (Non Transactional Table)

  • SELECT blocks INSERT
  • SELECT starts first followed by INSERT

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "INSERT INTO  orc_tab SELECT col1,col2 from txt_tab limit 20"

Session C

0: jdbc:hive2://localhost:10000/default> show locks;+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+|  lockid  | database  |  table  | partition  | lock_state  |  blocked_by  |  lock_type  | transaction_id  | last_heartbeat  |  acquired_at  | user  |  hostname  |  agent_info  |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID  | Database  | Table  | Partition  | State  | Blocked By  | Type  | Transaction ID  | Last Heartbeat  | Acquired At  | User  | Hostname  | Agent Info  || 38.1  | default  | orc_tab  | NULL  | ACQUIRED  |  | SHARED_READ  | NULL  | 1517004119030  | 1517004119030  | hive  | xlpatch.openstacklocal  | hive_20180126220158_775842e7-5e34-42d0-b574-874076fd5204  || 39.1  | default  | txt_tab  | NULL  | WAITING  |  | SHARED_READ  | NULL  | 1517004120971  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126220200_9e9eeb8c-9c32-42fd-8ddf-c96f08699224  || 39.2  | default  | orc_tab  | NULL  | WAITING  | 38.1  | EXCLUSIVE  | NULL  | 1517004120971  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126220200_9e9eeb8c-9c32-42fd-8ddf-c96f08699224  |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+4 rows selected (0.028 seconds)



SCENARIO 4 (Transactional Table)

  • SELECT does not block INSERT
  • SELECT starts first followed by INSERT

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab_bucketed order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "INSERT INTO orc_tab_bucketed SELECT col1,col2 from txt_tab limit 20"

Session C

0: jdbc:hive2://localhost:10000/default> show locks;+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+|  lockid  | database  |  table  | partition  | lock_state  |  blocked_by  |  lock_type  | transaction_id  | last_heartbeat  |  acquired_at  | user  |  hostname  |  agent_info  |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID  | Database  | Table  | Partition  | State  | Blocked By  | Type  | Transaction ID  | Last Heartbeat  | Acquired At  | User  | Hostname  | Agent Info  || 42.1  | default  | orc_tab_bucketed  | NULL  | ACQUIRED  |  | SHARED_READ  | NULL  | 1517004495025  | 1517004495025  | hive  | xlpatch.openstacklocal  | hive_20180126220814_cae3893a-8e97-49eb-8b07-a3a60c4a6dc2  || 43.1  | default  | txt_tab  | NULL  | ACQUIRED  |  | SHARED_READ  | 3  | 0  | 1517004495874  | hive  | xlpatch.openstacklocal  | hive_20180126220815_a335e284-476a-42e0-b758-e181e6ab44e9  || 43.2  | default  | orc_tab_bucketed  | NULL  | ACQUIRED  |  | SHARED_READ  | 3  | 0  | 1517004495874  | hive  | xlpatch.openstacklocal  | hive_20180126220815_a335e284-476a-42e0-b758-e181e6ab44e9  |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+4 rows selected (0.02 seconds)



SCENARIO 5 (Transactional Table)

  • SELECT does not block INSERT OVERWRITE
  • SELECT starts first followed by INSERT OVERWRITE

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab_bucketed order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "ALTER TABLE orc_tab_bucketed ADD COLUMNS (col3 string)"

Session C

0: jdbc:hive2://localhost:10000/default> show locks;Getting log thread is interrupted, since query is done!+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+|  lockid  | database  |  table  | partition  | lock_state  |  blocked_by  |  lock_type  | transaction_id  | last_heartbeat  |  acquired_at  | user  |  hostname  |  agent_info  |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID  | Database  | Table  | Partition  | State  | Blocked By  | Type  | Transaction ID  | Last Heartbeat  | Acquired At  | User  | Hostname  | Agent Info  || 53.1  | default  | orc_tab_bucketed  | NULL  | ACQUIRED  |  | SHARED_READ  | NULL  | 1517005855005  | 1517005855005  | hive  | xlpatch.openstacklocal  | hive_20180126223053_db2d0054-6cb6-48fb-b732-6ca677007695  || 54.1  | default  | orc_tab_bucketed  | NULL  | WAITING  | 53.1  | EXCLUSIVE  | NULL  | 1517005855870  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126223054_6294af5a-15da-4178-9a83-40f150e08cb1  |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+3 rows selected (0.064 seconds)

Synopsis

Without "transactional" feature set to true

  • EXCLUSIVE lock (ALTER) waits for SHARED (SELECT)
  • EXCLUSIVE lock (INSERT OVERWRITE) waits for SHARED (SELECT)
  • EXCLUSIVE lock (INSERT) waits for SHARED (SELECT)

With "transactional" enabled

  • EXCLUSIVE lock (ALTER) waits for SHARED (SELECT)
  • INSERT/SELECT both take SHARED lock
7,567 Views