Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

delete query from hive table (with partition) not working

avatar

I am trying to delete some of the rows from my hive table which has partitions. This is what I did.

delete from <table_name> where <condition>;

However, I am getting following error.

FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.

Please anyone suggest why the query is not working.

Thanks in advance.

4 REPLIES 4

avatar

@Saurav Ranjit

Steps!!!!

1.Need to enable ACID Property

2. table need to be in ORC format and bucket enabled.

Check table definition using "SHOW CREATE TABLE", to know if it satisfies above conditions.

then delete.

avatar

Hi,

Table is stored as TEXTFILE. It is not in ORC format or bucket enable. For such case is there a way around to delete the rows from such tables

avatar
Master Guru

@Saurav Ranjit

If your table is text format then the table won't have any delete/update capabilities.

The work around for this case as follows,

If your table is partitioned:
1.Then select the partition that you want to delete rows from and make sure any new data is not writing into this partition.
2.Take the specific partition data into temp table

hive> create table <db_name>.<temp_table_name> as select * from <db_name>.<partition_table_name> where <partition_field_name>="<desired_partition_value>";

3.Overwrite the same partition by excluding the unnecessary rows

hive> insert overwrite  <db_name>.<partition_table_name> partition(<partition_field_name>) select * from <db_name>.<temp_table_name> where <field_name> not in (<values_to_exclude>);

4. once you make sure that the data is correct then drop the temp table

hive> drop table <db_name>.<temp_table_name>;

These are the steps we need to follow for deleting specific rows in case of non-transactional table.

In addition if you are having non partitioned table then we need to get full dump of existing(target) table into temp table and overwrite the target table by excluding the unnecessary rows from the temp table and most important until this process is finished make sure you are not writing any new data into target table.

-

Even hive supports select and overwrite the same table at same time but any wrong queries will lead to loose data completely so it's better to use temp table in place and drop the table when we make sure the data is correct.

Example:

insert overwrite table <db_name>.<partition_table_name> partition(<partition_field_name>) select * from <db_name>.<partition_table_name> where <field_name> not in (<values_to_exclude>);

avatar

Thank you very much for the explanation regarding the work around to delete the row.