Support Questions

Find answers, ask questions, and share your expertise

Copying data from ACID table to a non-ACID enabled table

avatar
New Contributor

Hello, I have a transactional ACID table that is receiving data from a Spark streaming program. I would like to copy data out of this table into a non-ACID source but my results are inconsistent. Many times, my query will work and I am able to populate the target table, but sometimes I'll run into issues where the map reduce jobs don't launch while waiting for locks. The logs state 'Heart Beat' and then the job times out. What options are there to ensure consistent results?

My code:

set hive.support.concurrency=true;
set hive.execution.engine=mr;
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.compactor.initiator.on=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=false;

CREATE TABLE acid_enabled_table
(foo timestamp)
CLUSTERED BY(foo) INTO 3 BUCKETS
STORED AS ORC TBLPROPERTIES ('transactional'='true');


--My query to copy data from ACID table to non-Acid table
Insert into non_acid_table xyz
select foo from acid_enabled_table
where cond = 'abc';
1 REPLY 1

avatar
Contributor

Have you tried Disabling the ACID for that particular session and then try to run perform the query:-

1- Login to beeline or Hive CLI

2- Disable ACID for the session:-

  1. set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
  2. set hive.support.concurrency=false;

3- Run the Query now to copy the data.