Created on 06-18-2019 12:27 AM - edited 09-16-2022 07:27 AM
I wrote a java code to insert few thousand records by creating batch, I was not getting good performance from PreparedStatements so I switched to Statements and see if there is some performance difference.
However after getting error and reading Impala doc I came to know executebatch() is not supported for statements but I am able to use addBatch() so its kind a strange.
Here I am posting some of my code-
private static Connection connectViaDS() throws Exception { Connection connection = null; Class.forName("com.cloudera.impala.jdbc41.DataSource"); connection = DriverManager.getConnection(CONNECTION_URL); return connection; } //inside main-- // I have declared all the variables,I have lots of other data as well so it may create more confusion try { connection = connectViaDS(); connection.setAutoCommit(false); Statement statement = connection.createStatement(); Timestamp ed = new Timestamp(e_date); Timestamp et = new Timestamp(e_time); while(counter <total) { for (int index = 1; index <= records; index++) { counter++; String stmt_query = "INSERT INTO tbl_mindarray (source_ip,destination_ip,protocol_number" + index +"," + "source_port" + index + ",destination_port" + index + ",packet,volume" + index + ",duration" + index + "," + "pps" + index + ",bps" + index + ",bpp" + index + ",source_latitude,source_longitude," + "source_city,source_country,destination_latitude,destination_longitude ,destination_city ,destination_country ," + "ingress_volume" + index + " ,egress_volume " + index + ",ingress_packet ,egress_packet " + index + ",source_if_index ,destination_if_index," + "source_host,event_date,event_time,_time,flow,year)" + "VALUES ('s_ip','d_ip',234,23,56,556,34,27,46,389,76," + "'123123d','456456c','Damouli','Nepal','234234d','678678c','Mumbai','India',123,456,786,324,544,23,'192.168.1.44'," + "345" + index +",345" + index +",2010" + index +")"; statement.addBatch(stmt_query); } statement.executeBatch(); statement.clearBatch(); //connection.commit(); System.out.println("Counter = "+counter); //flag++; }
My question is how do I executebatch() is its not supported, addbatch() is supported so there must be something to execute it.
Here is the error which I am getting
java.sql.SQLException: Error message not found: NOT_IMPLEMENTED. Can't find resource for bundle java.util.PropertyResourceBundle, key NOT_IMPLEMENTED at com.cloudera.impala.hivecommon.dataengine.HiveJDBCDataEngine.prepareBatch(Unknown Source) at com.cloudera.impala.jdbc.common.SStatement.executeAnyBatch(Unknown Source) at com.cloudera.impala.jdbc.common.SStatement.executeBatch(Unknown Source) at impala_crt.writeInABatchWithCompiledQuery(impala_crt.java:278) Caused by: com.cloudera.impala.support.exceptions.GeneralException: Error message not found: NOT_IMPLEMENTED. Can't find resource for bundle java.util.PropertyResourceBundle, key NOT_IMPLEMENTED ... 4 more
Somewhere I read each prepared statement creates a file into hdfs so I believe statement should be faster as I can insert all row values using just one statement.
It would be really helpful if someone can share there knowledge on this.
Thanks
Created 06-19-2019 04:00 PM
If you want to use Statement then I think you just use simple:
Statement statement = connection.createStatement();
statement.execute("drop table foo");
I know in other DBMS there is a lot of optimiazation around jdbc PreparedStatements and batches.
There has not be a lot of emphasis on this in Impala.
As I mention in my other reply there is a tendency to use other means of ingestion for optimal perforamance.
Created 06-19-2019 04:00 PM
If you want to use Statement then I think you just use simple:
Statement statement = connection.createStatement();
statement.execute("drop table foo");
I know in other DBMS there is a lot of optimiazation around jdbc PreparedStatements and batches.
There has not be a lot of emphasis on this in Impala.
As I mention in my other reply there is a tendency to use other means of ingestion for optimal perforamance.
Created 06-20-2019 12:36 AM
I have accepted this answer but yet one thing bothers me addbatch() is supported so what is the use of addbatch() if I cannot execute it.
Thanks
Created 06-20-2019 11:15 AM
I agree that is super confusing