Created 09-19-2017 03:31 AM
Hi experts:
I've met a stange question. When I want to use hive jdbc to query a transactional table in my hive database, I can not get any result.
I create a table like this: create table simple4(id string, value string) partitioned by(updatetime date) clustered by (id) into 2 buckets stored as orc location '/apps/hive/warehouse/simple4' TBLPROPERTIES ("transactional"="true");
after that I've use jdbc to query it:
String url = "jdbc:hive2://XXXX:XXX/default"; try { Class.forName("org.apache.hive.jdbc.HiveDriver"); } catch (ClassNotFoundException e) { // TODO Auto-generated catch block e.printStackTrace(); } Connection conn = DriverManager.getConnection(url,"",""); Statement stmt = conn.createStatement(); stmt.execute("set hive.support.concurrency=true"); stmt.execute("set hive.enforce.bucketing=true"); stmt.execute("set hive.exec.dynamic.partition.mode=nonstrict"); stmt.execute("set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager"); String sql = "select count(*),updatetime from simple4 group by updatetime"; System.out.println("Running "+sql); ResultSet res = stmt.executeQuery(sql);
the url I use is ip address of my hive2 thrift server with port setting in hive.server2.thrift.http.port
with this setting , I can not get any result. but if I create table without TBLPROPERTIES ("transactional"="true") it will get result.
Then I try to use beeline to connect to my hive with beeline -u jdbc:hive2://XXXX:XXXX/, still the same: tablw without transactional property can be queried, but table with transactional return no result.
I use HDP 2.4.2 with Hive 1.2.1. Setting like "ACID Transactions", "hive.enforce.bucketing" etc, I've followed instruction on https://zh.hortonworks.com/tutorial/using-hive-acid-transactions-to-insert-update-and-delete-data/#h... and https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration.
Please suggest how can i figure out what's wrong with my setting? Really appreciate.
Created 09-19-2017 03:49 AM
BTW, I've cheked logs of hive and spark beeline, nothing related error founded
Created 09-19-2017 04:36 AM
Have you setup your hive server2 for transactions? If not, you need to setup HS2 to enable transactions.
Created on 09-19-2017 05:52 AM - edited 08-18-2019 12:38 AM
@mqureshi Thanks a lot for your response.
Yes I've checked it before and do setting accordingly, you can see my setting below:
besides, I've also create a hive-site.xml under my /usr/hdp/current/spark2/conf and copy items as below:
however still no luck, not sure where to find out issues
Created 09-19-2017 05:55 AM
I've create hive-site.xml under spark to make sure beeline can work, however seems it doesnot
Created 09-23-2017 03:51 PM
Hive ACID and transactions were beta (Tech Preview) in HDP 2.4.2. Is it possible for you to upgrade to HDP 2.6.1 and do your tests?
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_HDP_RelNotes/content/tech_previews.html
Created 09-25-2017 08:16 AM
This environment is our production environment and currently I can not update it . But I'll do some test in a sandbox or some other test environment and let you know the result. Thanks a lot.
Created 09-19-2017 07:43 AM
BTW, I notice that if I use beeline insert a record in a transactional table, I can query it out, if I use hive insert a record directly after, I can query both records inserted with hive&beeline in hive, but beeline can only query records that been inserted in beeline, but not records inserted in hive
Created 09-20-2017 05:08 PM
How did you populate the data from simple4 table? As you mentioned that you can query the table without TBLPROPERTIES ("transactional"="true") tells me that the orc data currently in the table is non transactional. Note that reading/writing to an ACID table from a non-ACID session is not allowed. So, just adding the "transactional"="true" to the table properties does not change a non-transactional table to transactional. Please refer to Hive Transactions Wiki for more information.
If you want to convert an existing non-transactional table to transactional (aka ACID) table, you can do something like the following in an ACID session
set ...; insert into table simple4 partition (updatetime) select id, value, updatetime from <simple4_nontransactional_table>;
Created 09-22-2017 08:39 AM
I populates simple4 by flume, and flume get data from kafka where I put my data in.
Current situation is: if I populate the table with flume combine with HiveSink (sink type is hive in flume sink config), I can not read these data from beeline(which should use hive jdbc I think) or SparkSQL program I write
if I use beeline or my SparkSQL program insert data in it, I can query out this part of data in beeline
So I'm confused why data in same transacational table can be distinguished in such a wired way