Support Questions

Find answers, ask questions, and share your expertise

when use hive jdbc , can not query transactional table

Explorer

Hi experts:

I've met a stange question. When I want to use hive jdbc to query a transactional table in my hive database, I can not get any result.

I create a table like this: create table simple4(id string, value string) partitioned by(updatetime date) clustered by (id) into 2 buckets stored as orc location '/apps/hive/warehouse/simple4' TBLPROPERTIES ("transactional"="true");

after that I've use jdbc to query it:

String url = "jdbc:hive2://XXXX:XXX/default";  
	        try {  
	            Class.forName("org.apache.hive.jdbc.HiveDriver");  
	        } catch (ClassNotFoundException e) {  
	            // TODO Auto-generated catch block  
	            e.printStackTrace();  
	        }  
	        Connection conn = DriverManager.getConnection(url,"","");  
	        Statement stmt = conn.createStatement();  
	        stmt.execute("set hive.support.concurrency=true");
	        stmt.execute("set hive.enforce.bucketing=true");
	        stmt.execute("set hive.exec.dynamic.partition.mode=nonstrict");
	        stmt.execute("set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
	        String sql = "select count(*),updatetime from simple4 group by updatetime";
	        System.out.println("Running  "+sql);  
	        ResultSet res = stmt.executeQuery(sql); 
 

the url I use is ip address of my hive2 thrift server with port setting in hive.server2.thrift.http.port

with this setting , I can not get any result. but if I create table without TBLPROPERTIES ("transactional"="true") it will get result.

Then I try to use beeline to connect to my hive with beeline -u jdbc:hive2://XXXX:XXXX/, still the same: tablw without transactional property can be queried, but table with transactional return no result.

I use HDP 2.4.2 with Hive 1.2.1. Setting like "ACID Transactions", "hive.enforce.bucketing" etc, I've followed instruction on https://zh.hortonworks.com/tutorial/using-hive-acid-transactions-to-insert-update-and-delete-data/#h... and https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration.

Please suggest how can i figure out what's wrong with my setting? Really appreciate.

11 REPLIES 11

Explorer

BTW, I've cheked logs of hive and spark beeline, nothing related error founded

Super Guru
@dustin wang

Have you setup your hive server2 for transactions? If not, you need to setup HS2 to enable transactions.

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_dataintegration/content/config-hs2-for-a...

Explorer

@mqureshi Thanks a lot for your response.

Yes I've checked it before and do setting accordingly, you can see my setting below:

39421-p1.png

besides, I've also create a hive-site.xml under my /usr/hdp/current/spark2/conf and copy items as below:

39420-p2.png

however still no luck, not sure where to find out issues

Explorer

I've create hive-site.xml under spark to make sure beeline can work, however seems it doesnot

Super Guru

@dustin wang

Hive ACID and transactions were beta (Tech Preview) in HDP 2.4.2. Is it possible for you to upgrade to HDP 2.6.1 and do your tests?

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_HDP_RelNotes/content/tech_previews.html

Explorer
@mqureshi

This environment is our production environment and currently I can not update it . But I'll do some test in a sandbox or some other test environment and let you know the result. Thanks a lot.

Explorer

BTW, I notice that if I use beeline insert a record in a transactional table, I can query it out, if I use hive insert a record directly after, I can query both records inserted with hive&beeline in hive, but beeline can only query records that been inserted in beeline, but not records inserted in hive

Master Collaborator

How did you populate the data from simple4 table? As you mentioned that you can query the table without TBLPROPERTIES ("transactional"="true") tells me that the orc data currently in the table is non transactional. Note that reading/writing to an ACID table from a non-ACID session is not allowed. So, just adding the "transactional"="true" to the table properties does not change a non-transactional table to transactional. Please refer to Hive Transactions Wiki for more information.

If you want to convert an existing non-transactional table to transactional (aka ACID) table, you can do something like the following in an ACID session

set ...;
insert into table simple4 partition (updatetime) select id, value, updatetime from <simple4_nontransactional_table>;

Explorer

@Deepesh

I populates simple4 by flume, and flume get data from kafka where I put my data in.

Current situation is: if I populate the table with flume combine with HiveSink (sink type is hive in flume sink config), I can not read these data from beeline(which should use hive jdbc I think) or SparkSQL program I write

if I use beeline or my SparkSQL program insert data in it, I can query out this part of data in beeline

So I'm confused why data in same transacational table can be distinguished in such a wired way

Master Collaborator

If Flume was used to populate Hive (througn HiveSink) then the data should have been written properly. Can you check the Flume log to see if the data was successfully getting written to Hive? Note that ACID needs to be enabled on Hive Metastore before streaming data from Flume into Hive.

Explorer

@Deepesh. Thanks a lot for your quick response. I think data been written properly, as I can use hive CLI to query all data in my table simple4, no matter data written by flume or by SparkSQL program. The problem is when I use SparkSQL or beeline CLI to query data, only data written by SparkSQL or inserted in beeline, not those been written by Flume.

Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.