Reply
Highlighted
Explorer
Posts: 23
Registered: ‎06-17-2019
Accepted Solution

Impala JDBC executeBatch() doesn't work for Statements.

I wrote a java code to insert few thousand records by creating batch, I was not getting good performance from PreparedStatements so I switched to Statements and see if there is some performance difference.

However after getting error and reading Impala doc I came to know executebatch() is not supported for statements but I am able to use addBatch() so its kind a strange.

Here I am posting some of my code-

 

    private static Connection connectViaDS() throws Exception {
        Connection connection = null;

        Class.forName("com.cloudera.impala.jdbc41.DataSource");

        connection = DriverManager.getConnection(CONNECTION_URL);

        return connection;

    }



//inside main--

// I have declared all the variables,I have lots of other data as well so it may create more confusion


        try {
            connection = connectViaDS();

            connection.setAutoCommit(false);

            Statement statement = connection.createStatement();


            Timestamp ed = new Timestamp(e_date);
            Timestamp et = new Timestamp(e_time);

            while(counter <total) {

                for (int index = 1; index <= records; index++) {

                    counter++;

                 String stmt_query = "INSERT INTO tbl_mindarray (source_ip,destination_ip,protocol_number" + index +"," +
                         "source_port" + index + ",destination_port" + index + ",packet,volume" + index + ",duration" + index + "," +
                         "pps" + index + ",bps" + index + ",bpp" + index + ",source_latitude,source_longitude," +
                         "source_city,source_country,destination_latitude,destination_longitude ,destination_city ,destination_country ," +
                         "ingress_volume" + index + " ,egress_volume " + index + ",ingress_packet ,egress_packet " + index + ",source_if_index ,destination_if_index," +
                         "source_host,event_date,event_time,_time,flow,year)" + "VALUES ('s_ip','d_ip',234,23,56,556,34,27,46,389,76," +
                         "'123123d','456456c','Damouli','Nepal','234234d','678678c','Mumbai','India',123,456,786,324,544,23,'192.168.1.44'," +
                         "345" + index +",345" + index +",2010" + index +")";

                    statement.addBatch(stmt_query);

                }
                statement.executeBatch();

                statement.clearBatch();

                //connection.commit();

                System.out.println("Counter = "+counter);

                //flag++;


            }

My question is how do I executebatch() is its not supported, addbatch() is supported so there must be something to execute it.

Here is the error which I am getting

java.sql.SQLException: Error message not found: NOT_IMPLEMENTED. Can't find resource for bundle java.util.PropertyResourceBundle, key NOT_IMPLEMENTED
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCDataEngine.prepareBatch(Unknown Source)
	at com.cloudera.impala.jdbc.common.SStatement.executeAnyBatch(Unknown Source)
	at com.cloudera.impala.jdbc.common.SStatement.executeBatch(Unknown Source)
	at impala_crt.writeInABatchWithCompiledQuery(impala_crt.java:278)
Caused by: com.cloudera.impala.support.exceptions.GeneralException: Error message not found: NOT_IMPLEMENTED. Can't find resource for bundle java.util.PropertyResourceBundle, key NOT_IMPLEMENTED
	... 4 more


Somewhere I read each prepared statement creates a file into hdfs so I believe statement should be faster as I can insert all row values using just one statement. 

 

It would be really helpful if someone can share there knowledge on this.
Thanks

Cloudera Employee
Posts: 17
Registered: ‎07-25-2018

Re: Impala JDBC executeBatch() doesn't work for Statements.

If you want to use Statement then I think you just use simple: 

 Statement statement = connection.createStatement();
 statement.execute("drop table foo");

 

I know in other DBMS there is a lot of optimiazation around jdbc PreparedStatements and batches.

There has not be a lot of emphasis on this in Impala.

As I mention in my other reply there is a tendency to use other means of ingestion for optimal perforamance.

Explorer
Posts: 23
Registered: ‎06-17-2019

Re: Impala JDBC executeBatch() doesn't work for Statements.

Hi @Andrew_Sherman 

 

I have accepted this answer but yet one thing bothers me addbatch() is supported so what is the use of addbatch() if I cannot execute it.

 

Thanks

Cloudera Employee
Posts: 17
Registered: ‎07-25-2018

Re: Impala JDBC executeBatch() doesn't work for Statements.

I agree that is super confusing