Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Impala JDBC executeBatch() doesn't work for Statements.

Solved Go to solution

Impala JDBC executeBatch() doesn't work for Statements.

Explorer

I wrote a java code to insert few thousand records by creating batch, I was not getting good performance from PreparedStatements so I switched to Statements and see if there is some performance difference.

However after getting error and reading Impala doc I came to know executebatch() is not supported for statements but I am able to use addBatch() so its kind a strange.

Here I am posting some of my code-

 

    private static Connection connectViaDS() throws Exception {
        Connection connection = null;

        Class.forName("com.cloudera.impala.jdbc41.DataSource");

        connection = DriverManager.getConnection(CONNECTION_URL);

        return connection;

    }



//inside main--

// I have declared all the variables,I have lots of other data as well so it may create more confusion


        try {
            connection = connectViaDS();

            connection.setAutoCommit(false);

            Statement statement = connection.createStatement();


            Timestamp ed = new Timestamp(e_date);
            Timestamp et = new Timestamp(e_time);

            while(counter <total) {

                for (int index = 1; index <= records; index++) {

                    counter++;

                 String stmt_query = "INSERT INTO tbl_mindarray (source_ip,destination_ip,protocol_number" + index +"," +
                         "source_port" + index + ",destination_port" + index + ",packet,volume" + index + ",duration" + index + "," +
                         "pps" + index + ",bps" + index + ",bpp" + index + ",source_latitude,source_longitude," +
                         "source_city,source_country,destination_latitude,destination_longitude ,destination_city ,destination_country ," +
                         "ingress_volume" + index + " ,egress_volume " + index + ",ingress_packet ,egress_packet " + index + ",source_if_index ,destination_if_index," +
                         "source_host,event_date,event_time,_time,flow,year)" + "VALUES ('s_ip','d_ip',234,23,56,556,34,27,46,389,76," +
                         "'123123d','456456c','Damouli','Nepal','234234d','678678c','Mumbai','India',123,456,786,324,544,23,'192.168.1.44'," +
                         "345" + index +",345" + index +",2010" + index +")";

                    statement.addBatch(stmt_query);

                }
                statement.executeBatch();

                statement.clearBatch();

                //connection.commit();

                System.out.println("Counter = "+counter);

                //flag++;


            }

My question is how do I executebatch() is its not supported, addbatch() is supported so there must be something to execute it.

Here is the error which I am getting

java.sql.SQLException: Error message not found: NOT_IMPLEMENTED. Can't find resource for bundle java.util.PropertyResourceBundle, key NOT_IMPLEMENTED
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCDataEngine.prepareBatch(Unknown Source)
	at com.cloudera.impala.jdbc.common.SStatement.executeAnyBatch(Unknown Source)
	at com.cloudera.impala.jdbc.common.SStatement.executeBatch(Unknown Source)
	at impala_crt.writeInABatchWithCompiledQuery(impala_crt.java:278)
Caused by: com.cloudera.impala.support.exceptions.GeneralException: Error message not found: NOT_IMPLEMENTED. Can't find resource for bundle java.util.PropertyResourceBundle, key NOT_IMPLEMENTED
	... 4 more


Somewhere I read each prepared statement creates a file into hdfs so I believe statement should be faster as I can insert all row values using just one statement. 

 

It would be really helpful if someone can share there knowledge on this.
Thanks

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Impala JDBC executeBatch() doesn't work for Statements.

Cloudera Employee

If you want to use Statement then I think you just use simple: 

 Statement statement = connection.createStatement();
 statement.execute("drop table foo");

 

I know in other DBMS there is a lot of optimiazation around jdbc PreparedStatements and batches.

There has not be a lot of emphasis on this in Impala.

As I mention in my other reply there is a tendency to use other means of ingestion for optimal perforamance.

3 REPLIES 3

Re: Impala JDBC executeBatch() doesn't work for Statements.

Cloudera Employee

If you want to use Statement then I think you just use simple: 

 Statement statement = connection.createStatement();
 statement.execute("drop table foo");

 

I know in other DBMS there is a lot of optimiazation around jdbc PreparedStatements and batches.

There has not be a lot of emphasis on this in Impala.

As I mention in my other reply there is a tendency to use other means of ingestion for optimal perforamance.

Highlighted

Re: Impala JDBC executeBatch() doesn't work for Statements.

Explorer

Hi @Andrew_Sherman 

 

I have accepted this answer but yet one thing bothers me addbatch() is supported so what is the use of addbatch() if I cannot execute it.

 

Thanks

Re: Impala JDBC executeBatch() doesn't work for Statements.

Cloudera Employee

I agree that is super confusing