Support Questions
Find answers, ask questions, and share your expertise

Reading a Hive table while doing insert overwrite on a partition

Reading a Hive table while doing insert overwrite on a partition

I am doing an insert overwrite on a partition(static) which takes a few minutes to finish. I want to still query the same table with select command when am doing the insert. But the select query does not start running until the insert query finishes . I know the insert query gets an exclusive lock on just the partition I am writing to .I am trying to figure out why the select query is not able to just read all other partitions on which it can acquire a lock on

6 REPLIES 6

Re: Reading a Hive table while doing insert overwrite on a partition

Cloudera Employee
@viswanath

Can you post your sql statement the select query

Re: Reading a Hive table while doing insert overwrite on a partition

@viswanath

For a partitioned table, the idea is as follows:

A 'S' lock on table and relevant partition is acquired when a read is being performed. For all other operations, an 'X' lock is taken on the partition. However, if the change is only applicable to the newer partitions, a 'S' lock is acquired on the table, whereas if the change is applicable to all partitions, a 'X' lock is acquired on the table. Thus, older partitions can be read and written into, while the newer partitions are being converted to RCFile. Whenever a partition is being locked in any mode, all its parents are locked in 'S' mode.

If you are running 'select' query on the whole of table, then it would wait for the exclusive lock.

For more details, refer to link.

Re: Reading a Hive table while doing insert overwrite on a partition

I am not running the select query on the whole table. I am running select query with where clause filtering a single partition which is not been updated. It still waits for the insert to be finished even if the partition its inserting is not an existing one.

Re: Reading a Hive table while doing insert overwrite on a partition

@viswanath

When you are running the insert statement for a static partition then the lock is obtained over the folder which would be created for the static partition. Now your select query would running over the entire table or the entire folder created for the table which includes all the partition including the one which you are overwriting. In such case it comes to simple file handling. When a file is getting written then it locks the file. The same applies here as well. Hope it helps!!

Re: Reading a Hive table while doing insert overwrite on a partition

I am running a select query with where condition picking the partition which is not been inserted. Now the select query is not supposed to run over all partitions. But still the select query waits for the insert(into static partition) to be finished.

Re: Reading a Hive table while doing insert overwrite on a partition

One more thing.When I issue show locks <table> command. I can only see exclusive locks being shown for insert queries but it is not showing shared locks on select queries ?