Cloudera Employee
Posts: 368
Registered: ‎07-29-2015

Re: Impala ODBC/JDBC bad performance - rows fetch is very slow from a remote server compared with NN

In Impala 2.11 we capped the maximum batch_size setting. Before that you could set it to an arbitrarily high value, which could have strange consequences. It is still something of a use-at-your-own-risk setting, since it can affect memory consumption and performance.

The real fix for this would be https://issues.apache.org/jira/browse/IMPALA-1618. Setting batch_size is just a workaround that may or may not work for you.

Explorer
Posts: 11
Registered: ‎09-20-2018

How do I set BATCH_SIZE for an Impala JDBC connection? I tried, but it is not working for me.

Expert Contributor
Posts: 125
Registered: ‎07-17-2017

Hi @Bishnup

Configuring Server-Side Properties
When connecting to a server that is running Impala 2.0 or later, you can use the driver to apply configuration properties to the server by setting the properties in the connection URL.
https://www.cloudera.com/documentation/other/connectors/impala-jdbc/latest/Cloudera-JDBC-Driver-for-...
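
For reference, here is a minimal sketch of what that could look like in Java, assuming the driver's `jdbc:impala://host:port/schema;Key=Value` URL form. The host name, the schema, and the value 4096 are placeholders, and `ImpalaUrlSketch` and `buildUrl` are hypothetical helper names, not part of the driver. BATCH_SIZE is passed under its plain Impala query option name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the Cloudera driver): assembles a
// connection URL with server-side properties appended as ;KEY=VALUE pairs.
final class ImpalaUrlSketch {

    static String buildUrl(String host, int port, String schema,
                           Map<String, String> serverOptions) {
        StringBuilder url = new StringBuilder("jdbc:impala://")
                .append(host).append(':').append(port)
                .append('/').append(schema);
        // Each entry is appended in insertion order as ;KEY=VALUE.
        for (Map.Entry<String, String> option : serverOptions.entrySet()) {
            url.append(';').append(option.getKey())
               .append('=').append(option.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        // Placeholder value; pick one that suits your cluster and Impala version.
        options.put("BATCH_SIZE", "4096");
        // 21050 is Impala's default port for JDBC/HiveServer2 clients.
        System.out.println(
                buildUrl("impala-host.example.com", 21050, "default", options));
    }
}
```

The resulting string would then be passed to `DriverManager.getConnection(url)` with the Cloudera driver on the classpath. Whether a larger BATCH_SIZE actually helps depends on the workload.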

Good luck.

Explorer
Posts: 11
Registered: ‎09-20-2018

Hi @AcharkiMed

I tried setting the batch size in the connection URL, but query fetch time did not improve. I have posted my use case in the Cloudera forum. Kindly answer my questions:

Expert Contributor
Posts: 125
Registered: ‎07-17-2017

Hi,

Please try changing all three of these parameters:

TSaslTransportBufSize=4000;
RowsFetchedPerBlock=60536;
SSP_BATCH_SIZE=60536;
Explorer
Posts: 11
Registered: ‎09-20-2018

Hi @AcharkiMed

As you suggested, I set

TSaslTransportBufSize=4000;
RowsFetchedPerBlock=60536;
SSP_BATCH_SIZE=60536;

in the connection URL, but I am getting these errors:

java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:Invalid query option: SSP_BATCH_SIZE
), Query: SET SSP_BATCH_SIZE=60536.
        at com.cloudera.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source) ~[Impala-JDBC-41-1.0.0.jar!/:na]

and

java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:Invalid query option: TSaslTransportBufSize
), Query: SET TSaslTransportBufSize=4000.

Please help me set these properties.

Thank You,

Bishnu
