Community Articles

Find and share helpful community-sourced technical articles.
Announcements
Check out our newest addition to the community, the Cloudera Data Analytics (CDA) group hub.
Labels (1)
Guru

Simple illustration of locking in Hive when ACID is enabled

Considerations for illustration

  • Cluster Version: HDP -2.5.6.0
  • Hive Version: Hive 1.2.1000
  • Enabled with following properties in place
    • hive.support.concurrency=true
    • hive.compactor.initiator.on=true
    • hive.compactor.worker.threads=1
    • hive.exec.dynamic.partition.mode=nonstrict
    • hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager (if non-Ambari cluster)

Types of Supported Locks

  • S = SHARED or SHARED_READ
  • X = EXCLUSIVE

Tables used for testing

  • orc_tab (ORC format table with col1 int and col2 string), non-transactional
  • orc_tab_bucketed(ORC format table with col1 int and col2 string, transactional)
  • txt_tab (TEXT format table with col1 int, col2 string, non-transactional, for loading purposes)
  • Either tables have closed to 5 GB data on a Single node cluster



SCENARIO 1 (Non Transactional Table)

  • SELECT blocks ALTER
  • SELECT starts first followed by ALTER

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "ALTER TABLE orc_tab ADD COLUMNS (col3 string)"

Session C

+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+
|  lockid  | database  |  table   | partition  | lock_state  |  blocked_by   |  lock_type   | transaction_id  | last_heartbeat  |  acquired_at   | user  |        hostname         |                        agent_info                         |
+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+
| Lock ID  | Database  | Table    | Partition  | State       | Blocked By    | Type         | Transaction ID  | Last Heartbeat  | Acquired At    | User  | Hostname                | Agent Info                                                |
| 31.1     | default   | orc_tab  | NULL       | ACQUIRED    |               | SHARED_READ  | NULL            | 1517003062122   | 1517003062122  | hive  | xlpatch.openstacklocal  | hive_20180126214422_aaeb4b28-5170-4131-b509-ef0213c8b842  |
| 32.1     | default   | orc_tab  | NULL       | WAITING     | 31.1          | EXCLUSIVE    | NULL            | 1517003063314   | NULL           | hive  | xlpatch.openstacklocal  | hive_20180126214422_a65af104-05d1-4c19-ab54-7bb37b4cdbfa  |
+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+



SCENARIO 2 (Non Transactional Table)

  • SELECT blocks INSERT OVERWRITE
  • SELECT starts first followed by INSERT OVERWRITE

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "INSERT OVERWRITE TABLE orc_tab SELECT col1,col2 from txt_tab"

Session C

0: jdbc:hive2://localhost:10000/default> show locks;+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+|  lockid  | database  |  table  | partition  | lock_state  |  blocked_by  |  lock_type  | transaction_id  | last_heartbeat  |  acquired_at  | user  |  hostname  |  agent_info  |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID  | Database  | Table  | Partition  | State  | Blocked By  | Type  | Transaction ID  | Last Heartbeat  | Acquired At  | User  | Hostname  | Agent Info  || 36.1  | default  | orc_tab  | NULL  | ACQUIRED  |  | SHARED_READ  | NULL  | 1517003567582  | 1517003567582  | hive  | xlpatch.openstacklocal  | hive_20180126215247_7537e30b-d5bf-4fc8-aa23-8e860efe1ac8  || 37.1  | default  | txt_tab  | NULL  | WAITING  |  | SHARED_READ  | NULL  | 1517003568897  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126215248_875685ed-a552-4009-892c-e13c61cf7eb5  || 37.2  | default  | orc_tab  | NULL  | WAITING  | 36.1  | EXCLUSIVE  | NULL  | 1517003568897  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126215248_875685ed-a552-4009-892c-e13c61cf7eb5  |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+



SCENARIO 3 (Non Transactional Table)

  • SELECT blocks INSERT
  • SELECT starts first followed by INSERT

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "INSERT INTO  orc_tab SELECT col1,col2 from txt_tab limit 20"

Session C

0: jdbc:hive2://localhost:10000/default> show locks;+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+|  lockid  | database  |  table  | partition  | lock_state  |  blocked_by  |  lock_type  | transaction_id  | last_heartbeat  |  acquired_at  | user  |  hostname  |  agent_info  |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID  | Database  | Table  | Partition  | State  | Blocked By  | Type  | Transaction ID  | Last Heartbeat  | Acquired At  | User  | Hostname  | Agent Info  || 38.1  | default  | orc_tab  | NULL  | ACQUIRED  |  | SHARED_READ  | NULL  | 1517004119030  | 1517004119030  | hive  | xlpatch.openstacklocal  | hive_20180126220158_775842e7-5e34-42d0-b574-874076fd5204  || 39.1  | default  | txt_tab  | NULL  | WAITING  |  | SHARED_READ  | NULL  | 1517004120971  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126220200_9e9eeb8c-9c32-42fd-8ddf-c96f08699224  || 39.2  | default  | orc_tab  | NULL  | WAITING  | 38.1  | EXCLUSIVE  | NULL  | 1517004120971  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126220200_9e9eeb8c-9c32-42fd-8ddf-c96f08699224  |+----------+-----------+----------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+4 rows selected (0.028 seconds)



SCENARIO 4 (Transactional Table)

  • SELECT does not block INSERT
  • SELECT starts first followed by INSERT

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab_bucketed order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "INSERT INTO orc_tab_bucketed SELECT col1,col2 from txt_tab limit 20"

Session C

0: jdbc:hive2://localhost:10000/default> show locks;+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+|  lockid  | database  |  table  | partition  | lock_state  |  blocked_by  |  lock_type  | transaction_id  | last_heartbeat  |  acquired_at  | user  |  hostname  |  agent_info  |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID  | Database  | Table  | Partition  | State  | Blocked By  | Type  | Transaction ID  | Last Heartbeat  | Acquired At  | User  | Hostname  | Agent Info  || 42.1  | default  | orc_tab_bucketed  | NULL  | ACQUIRED  |  | SHARED_READ  | NULL  | 1517004495025  | 1517004495025  | hive  | xlpatch.openstacklocal  | hive_20180126220814_cae3893a-8e97-49eb-8b07-a3a60c4a6dc2  || 43.1  | default  | txt_tab  | NULL  | ACQUIRED  |  | SHARED_READ  | 3  | 0  | 1517004495874  | hive  | xlpatch.openstacklocal  | hive_20180126220815_a335e284-476a-42e0-b758-e181e6ab44e9  || 43.2  | default  | orc_tab_bucketed  | NULL  | ACQUIRED  |  | SHARED_READ  | 3  | 0  | 1517004495874  | hive  | xlpatch.openstacklocal  | hive_20180126220815_a335e284-476a-42e0-b758-e181e6ab44e9  |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+4 rows selected (0.02 seconds)



SCENARIO 5 (Transactional Table)

  • SELECT does not block INSERT OVERWRITE
  • SELECT starts first followed by INSERT OVERWRITE

Session A

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "select col1 from orc_tab_bucketed order by col1 limit 2"

Session B

beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' -e "ALTER TABLE orc_tab_bucketed ADD COLUMNS (col3 string)"

Session C

0: jdbc:hive2://localhost:10000/default> show locks;Getting log thread is interrupted, since query is done!+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+|  lockid  | database  |  table  | partition  | lock_state  |  blocked_by  |  lock_type  | transaction_id  | last_heartbeat  |  acquired_at  | user  |  hostname  |  agent_info  |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+| Lock ID  | Database  | Table  | Partition  | State  | Blocked By  | Type  | Transaction ID  | Last Heartbeat  | Acquired At  | User  | Hostname  | Agent Info  || 53.1  | default  | orc_tab_bucketed  | NULL  | ACQUIRED  |  | SHARED_READ  | NULL  | 1517005855005  | 1517005855005  | hive  | xlpatch.openstacklocal  | hive_20180126223053_db2d0054-6cb6-48fb-b732-6ca677007695  || 54.1  | default  | orc_tab_bucketed  | NULL  | WAITING  | 53.1  | EXCLUSIVE  | NULL  | 1517005855870  | NULL  | hive  | xlpatch.openstacklocal  | hive_20180126223054_6294af5a-15da-4178-9a83-40f150e08cb1  |+----------+-----------+-------------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-------------------------+-----------------------------------------------------------+--+3 rows selected (0.064 seconds)

Synopsis

Without "transactional" feature set to true

  • EXCLUSIVE lock (ALTER) waits for SHARED (SELECT)
  • EXCLUSIVE lock (INSERT OVERWRITE) waits for SHARED (SELECT)
  • EXCLUSIVE lock (INSERT) waits for SHARED (SELECT)

With "transactional" enabled

  • EXCLUSIVE lock (ALTER) waits for SHARED (SELECT)
  • INSERT/SELECT both take SHARED lock
6,590 Views
Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.
Version history
Last update:
‎01-26-2018 11:01 PM
Updated by:
Contributors
Top Kudoed Authors