Member since: 09-20-2018
Posts: 11
Kudos Received: 0
Solutions: 0
10-16-2018
10:24 PM
Hi @Burle, can you suggest a load balancer that checks Impala health or available resources and distributes the load to the Impala nodes accordingly? Regards, Bishnu
10-10-2018
11:43 PM
Hi @EricL, thank you for your response. I want to know how Impala distributes resources when it has multiple concurrent queries to execute. In my case, query performance slows down under concurrent execution. Is there any property I need to set, or anything else I need to do, to get high concurrent performance from Impala? Regards, Bishnu
10-10-2018
11:37 PM
Hi @BikramjeetVig, I have tried setting mem_limit, but only in the Impala conf file, and I didn't see any improvement in concurrent query performance.
10-01-2018
05:19 AM
Hi @AcharkiMed, as you suggested, I set TSaslTransportBufSize=4000; RowsFetchedPerBlock=60536; SSP_BATCH_SIZE=60536; in the connection URL. After making the changes I am getting these errors:
java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:Invalid query option: SSP_BATCH_SIZE
), Query: SET SSP_BATCH_SIZE=60536.
at com.cloudera.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source) ~[Impala-JDBC-41-1.0.0.jar!/:na]
and
java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:Invalid query option: TSaslTransportBufSize
), Query: SET TSaslTransportBufSize=4000.
Please help me set these properties. Thank you, Bishnu
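The `Query: SET ...` part of each error suggests that the driver forwarded these URL keys to Impala as SET statements, which the server then rejected. As a rough sketch (the helper name `rejected_options` is made up for illustration, and the log text is abbreviated from the errors above), the rejected option names can be pulled out of such a log like this:

```python
import re
from typing import List

# Abbreviated copy of the two error messages quoted in the post above.
ERROR_LOG = """\
errorMessage:Invalid query option: SSP_BATCH_SIZE
), Query: SET SSP_BATCH_SIZE=60536.
errorMessage:Invalid query option: TSaslTransportBufSize
), Query: SET TSaslTransportBufSize=4000.
"""


def rejected_options(log: str) -> List[str]:
    """Return the option names the server reported as invalid, in log order."""
    return re.findall(r"Invalid query option:\s*(\w+)", log)


print(rejected_options(ERROR_LOG))
# prints ['SSP_BATCH_SIZE', 'TSaslTransportBufSize']
```

Note that RowsFetchedPerBlock does not appear in either error, so only the other two keys were reported as invalid query options.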
10-01-2018
02:30 AM
Hi @AcharkiMed, I tried setting the batch size in the connection URL, but the query fetch time did not improve. I have posted my use case on the Cloudera forum. Kindly answer my questions there:
https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-concurrent-query-delay/m-p/80228#M4911
https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Understanding-How-does-impala-handles-concurrent-queries/m-p/80296#M4921
09-27-2018
11:29 PM
Can you tell me how to set BATCH_SIZE for an Impala JDBC connection? I tried, but it is not working for me.
09-25-2018
07:25 AM
My cluster configuration is as follows:
- 3-node cluster
- 128 GB RAM per cluster node
- Processor: 16 cores, hyper-threaded, per cluster node
My questions are:
1) If I submit 5 queries, how does Impala execute them concurrently? (How does Impala distribute the load in the cluster?)
2) If the number of queries increases above 10, how will query execution be load balanced?
I am observing unexpected behaviour from Impala when I execute queries concurrently, so I just want to understand how Impala distributes queries and resources.
Labels:
- Apache Impala
- Apache Kudu
- Cloudera Manager
09-24-2018
11:05 AM
Then how do I share resources so that each query gets equal CPU and memory?
09-24-2018
03:52 AM
Q) Are you load balancing the queries across Impala daemons?
Ans) How do I load balance the queries across Impala daemons?
Q) If just two Impala daemons are working on the query, it means you are running queries on small data (i.e. blocks are just on two nodes). What kind of queries are you running (are there just scans, or broadcasts)?
Ans) The queries contain multi-way joins across various tables. I've tried submitting a bigger query that takes around 10-15 s, but the query still does not go to that specific node. Is there any way to check why the load is not being distributed to that node? According to the Cloudera documentation for NUM_NODES, the option only accepts the values 0 (meaning all nodes) or 1 (meaning all work is done on the coordinator node). Even after setting NUM_NODES to 1 on that specific node, the query still goes to one of the other nodes.
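One simple client-side way to spread queries across daemons is to rotate which daemon each new connection targets, since the daemon a client connects to acts as the coordinator for that query. The following is only a sketch of that round-robin idea: the hostnames and the `coordinator_picker` helper are hypothetical, and it does not check daemon health or load.

```python
from itertools import cycle

# Hypothetical hostnames; substitute the three Impala daemon hosts.
IMPALA_DAEMONS = ["impala-node1", "impala-node2", "impala-node3"]


def coordinator_picker(hosts):
    """Return a function that hands out hosts in round-robin order."""
    ring = cycle(hosts)
    return lambda: next(ring)


next_coordinator = coordinator_picker(IMPALA_DAEMONS)

# Each new connection would target the next daemon in the rotation.
for _ in range(4):
    print(next_coordinator())
```

A proxy placed in front of the daemons can do the same rotation on the server side, without each client needing to know the host list.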
09-20-2018
11:20 PM
My cluster configuration is as follows:
- 3-node cluster
- 128 GB RAM per cluster node
- Processor: 16 cores, hyper-threaded, per cluster node
All 3 nodes run a Kudu master, a Kudu tablet server (T-Server) and an Impala server; one of the nodes also runs the Impala catalog and the Impala StateStore.
My issues are as follows:
1) I'm having a hard time figuring out dynamic resource pools in Impala while running concurrent queries. I've tried setting mem_limit, still no luck. I've also tried a static service pool, but with that I couldn't achieve the required concurrency either. Even with admission control, the required concurrency was not achieved.
I) Time taken for 1 query: 500-800 ms.
II) With 10 concurrent queries, the time taken grows to 3-6 s per query.
III) With more than 20 concurrent queries, the time taken exceeds 10 s per query.
2) One of my cluster nodes is not taking load after a query is submitted; I checked this in the query summary. I've tried setting NUM_NODES to 0 and to 1 on the node that is not taking load, but the summary still shows that the node is not taking load.
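Timings like those in I) to III) can be collected with a small harness that submits a batch of queries at once and records each query's latency. This is a minimal sketch, not the setup used for the numbers above: `run_query` is a hypothetical placeholder that only sleeps, and it would need to be replaced with a real client call against the cluster.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_query() -> None:
    """Hypothetical stand-in for one Impala query (replace with a real call)."""
    time.sleep(0.01)


def timed(query: Callable[[], None]) -> float:
    """Run one query and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    query()
    return time.perf_counter() - start


def measure(concurrency: int, query: Callable[[], None] = run_query) -> List[float]:
    """Submit `concurrency` queries at once and collect each latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(timed, query) for _ in range(concurrency)]
        return [f.result() for f in futures]


if __name__ == "__main__":
    # Concurrency levels chosen to match the cases described in the post.
    for level in (1, 10, 20):
        latencies = measure(level)
        print(
            f"{level:>2} concurrent: "
            f"median {statistics.median(latencies) * 1000:.0f} ms, "
            f"max {max(latencies) * 1000:.0f} ms"
        )
```

Reporting both the median and the maximum helps distinguish a uniform slowdown from a few queries that wait much longer than the rest.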
Labels:
- Apache Impala
- Apache Kudu