Created 04-15-2020 07:10 AM
Hi Community.
I've created a transactional table on HIVE.
I'm able to perform CRUD operations.
When I try to query the same table from IMPALA, my query returns 0 rows.
I've already invalidated metadata for that table but cannot see any of the existing records.
Where am I wrong?
Created 04-16-2020 01:43 AM
Hi,
transactional tables are not officially supported in CDH. That version if Impala doesn't even know the concept of a transactional table, therefore it doesn't even know that it should reject such tables.
We started to support transactional tables in CDP, but even in CDP Impala the support is limited as the time of writing. You created a so called "full ACID" transactional table. Support for "full ACID" tables is currently under development in Impala. So if you want to read your transactional tables from Impala as well, it's better to create "INSERT-only" transactional tables (with TBLPROPERTIES ('transactional'='true', 'transactional_properties'='insert_only');).
So my advice is not to use transactional tables in CDH, and always check the official documentation of what is supported and what is not. If you want to try out transactional tables you need to use CDP, the newer version the better. But again, note that this functionality is under rapid development in Impala right now.
Created 04-15-2020 10:03 AM
Can you provide more information about your version and how the table was created. Ideally "show create table <table>" output. The answer depends a lot on those things because the transactional table support has evolved a lot in recent versions and there are several variants of transactional tables.
Created 04-16-2020 12:35 AM
Thanks for your answer, below the details.
I am running Hive and Impala on CDH 6.1.1
I've set those parameters:
SET hive.support.concurrency = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
set hive.compactor.initiator.on = true;
set hive.compactor.worker.threads = 1;
set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
Then I've created a sample table:
CREATE TABLE MYDIM (key int, name string, zip string, is_current boolean)
CLUSTERED BY (key) INTO 3 BUCKETS
STORED AS ORC TBLPROPERTIES ('transactional'='true');
I've loaded some data
INSERT INTO MYDIM VALUES
(1, 'bob', '95136', true),
(2, 'joe', '70068', true),
(3, 'steve', '22150', true);
I run a select in HIVE and records are returned
SELECT * FROM MYDIM;
mydim.key mydim.name mydim.zip mydim.is_current
1 bob 95136 true
2 joe 70068 true
3 steve 22150 true
I can update & delete
UPDATE MYDIM SET NAME = 'svasi' WHERE KEY=3;
SELECT * FROM MYDIM;
mydim.key mydim.name mydim.zip mydim.is_current
1 bob 95136 true
2 joe 70068 true
3 svasi 22150 true
Table is available in Impala but no results are returned when I run the select * from mydim
I've tried to invalidate metadata but still no records are read from Impala query
Thanks for your help
Created 04-16-2020 01:43 AM
Hi,
transactional tables are not officially supported in CDH. That version if Impala doesn't even know the concept of a transactional table, therefore it doesn't even know that it should reject such tables.
We started to support transactional tables in CDP, but even in CDP Impala the support is limited as the time of writing. You created a so called "full ACID" transactional table. Support for "full ACID" tables is currently under development in Impala. So if you want to read your transactional tables from Impala as well, it's better to create "INSERT-only" transactional tables (with TBLPROPERTIES ('transactional'='true', 'transactional_properties'='insert_only');).
So my advice is not to use transactional tables in CDH, and always check the official documentation of what is supported and what is not. If you want to try out transactional tables you need to use CDP, the newer version the better. But again, note that this functionality is under rapid development in Impala right now.
Created 04-16-2020 01:47 AM
Thanks Zol.