Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

[How?] Hive INSERT OVERWRITE to Transaction Table

avatar
New Contributor

Hi all,

 

I would like to execute "INSERT OVERWRITE" from managed table to Hive Transaction Table which is EMRFS.

 

What I tried was to first create managed table with s3 location (CSV), and executed the following:

 

INSERT OVERWRITE TABLE tobe_tbl
SELECT * FROM asis_tbl LIMIT 10;

 

but I got error as below:

...

Total jobs = 1
Launching Job 1 out of 1
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask

...

 

I noticed that there was an error saying it is writing in Avro not ORC.

 

What commands or configurations could I make to properly INSERT OVERWRITE to Hive Transaction Table?

1 ACCEPTED SOLUTION

avatar
Expert Contributor

Please share the complete stack-trace to get better context.  

To perform an INSERT OVERWRITE operation on a Hive ACID transactional table, you need to ensure that you have the right configuration and execute the query correctly. Here are the steps and configurations:

Enable ACID Transactions: Make sure your table is created with ACID properties. You can specify it during table creation like this:

 

CREATE TABLE my_table (
    -- Your table schema here
)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

 


If your table is not already transactional, you may need to create a new transactional table with the desired schema.

Set Hive ACID Properties: You should set some Hive configuration properties to enable ACID transactions if they are not already set:

 

SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=1;  -- Number of compactor threads depending on the number of managed tables and usgae

 

Perform the INSERT OVERWRITE:

Use the INSERT OVERWRITE statement to replace the data in the table:

 

INSERT OVERWRITE TABLE my_table
SELECT ...
FROM ...​

 

Ensure that the SELECT statement fetches the data you want to overwrite with. You can use a WHERE clause or other filters to specify the data you want to replace.

Enable Auto-Compaction : You can enable auto-compaction to periodically clean up small files created by ACID transactions.


REF - 

https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_data-access/content/ch02s05s01.html

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/managing-hive/content/hive_acid_operations.htm...

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/managing-hive/content/hive_hive_data_compactio...

 



View solution in original post

1 REPLY 1

avatar
Expert Contributor

Please share the complete stack-trace to get better context.  

To perform an INSERT OVERWRITE operation on a Hive ACID transactional table, you need to ensure that you have the right configuration and execute the query correctly. Here are the steps and configurations:

Enable ACID Transactions: Make sure your table is created with ACID properties. You can specify it during table creation like this:

 

CREATE TABLE my_table (
    -- Your table schema here
)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

 


If your table is not already transactional, you may need to create a new transactional table with the desired schema.

Set Hive ACID Properties: You should set some Hive configuration properties to enable ACID transactions if they are not already set:

 

SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=1;  -- Number of compactor threads depending on the number of managed tables and usgae

 

Perform the INSERT OVERWRITE:

Use the INSERT OVERWRITE statement to replace the data in the table:

 

INSERT OVERWRITE TABLE my_table
SELECT ...
FROM ...​

 

Ensure that the SELECT statement fetches the data you want to overwrite with. You can use a WHERE clause or other filters to specify the data you want to replace.

Enable Auto-Compaction : You can enable auto-compaction to periodically clean up small files created by ACID transactions.


REF - 

https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_data-access/content/ch02s05s01.html

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/managing-hive/content/hive_acid_operations.htm...

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/managing-hive/content/hive_hive_data_compactio...