
Cloudera Employee

Summary

Are you seeing a disproportionate number of queries being handled by a single Impala Coordinator?

Does this eventually lead to OOM scenarios?

Let’s say you have three Impala Coordinators in your cluster and notice that queries skew onto one of them and overwhelm it.

 

[Screenshot: Cloudera Manager view of running queries per Impala Coordinator]

Note how one of the Impala Coordinators in the example above has 73 running queries, while the other two have relatively few.
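One way to spot this skew outside Cloudera Manager is to poll each coordinator’s debug web UI and compare query counts. The sketch below is illustrative only: the /metrics?json endpoint, its JSON layout, and the metric name 'impala-server.num-queries-registered' are assumptions to verify against your Impala version, and the hostnames are placeholders.

```python
import json
from urllib.request import urlopen

# Hypothetical coordinator hostnames, matching the example cluster above.
COORDINATORS = [
    "impala-coordinator-001.fqdn",
    "impala-coordinator-002.fqdn",
    "impala-coordinator-003.fqdn",
]

def running_queries(host: str, port: int = 25000) -> int:
    """Read one coordinator's registered-query count from its debug web UI.

    Assumes the impalad debug endpoint serves JSON at /metrics?json and
    includes 'impala-server.num-queries-registered'; verify both against
    your Impala version before relying on this.
    """
    with urlopen(f"http://{host}:{port}/metrics?json") as resp:
        payload = json.load(resp)
    # Flatten whatever metric list comes back and look the name up.
    flat = {m.get("name"): m.get("value") for m in payload.get("metrics", [])}
    return int(flat.get("impala-server.num-queries-registered", 0))

def detect_skew(counts: dict, factor: float = 2.0) -> list:
    """Return coordinators carrying more than `factor` x the mean load."""
    mean = sum(counts.values()) / len(counts)
    return sorted(h for h, c in counts.items() if c > factor * mean)
```

With the counts from the example above (73, 4, and 6 running queries), detect_skew flags only the overloaded coordinator.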

Investigation

Source IP Persistence

To understand why queries can skew onto one Impala Coordinator, look at how the proxy is configured to route incoming connections.

 

‘Source IP Persistence’ means that sessions from the same client IP address always go to the same coordinator. This setting is required when setting up high availability with Hue. It is also required to avoid the Hue message ‘results have expired’, which appears when a query is sent to the cluster through one coordinator but the results are fetched through a different coordinator/Hue server.
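To see why source persistence can concentrate load, it helps to model the balancing decision. Below is a rough Python model of map-based source hashing (HAProxy’s real hash function differs, and the backend names are hypothetical): a fixed client IP always maps to the same coordinator, so a small number of client IPs, such as one or two Hue servers, can pin most sessions to a single coordinator.

```python
import zlib

# Hypothetical backend names standing in for the coordinators.
BACKENDS = [
    "impala-coordinator-001",
    "impala-coordinator-002",
    "impala-coordinator-003",
]

def pick_backend(client_ip: str, backends=BACKENDS) -> str:
    """Map-based source hashing: hash the client IP, then take it modulo
    the live backend count. crc32 stands in for HAProxy's internal hash;
    the point is only that a fixed IP always yields the same backend."""
    return backends[zlib.crc32(client_ip.encode()) % len(backends)]
```

Every session from a given Hue server therefore lands on the same coordinator, which is exactly the kind of skew shown above when most traffic arrives through one Hue instance.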

Example HAProxy Configuration for Source IP Persistence

The public documentation for setting up HAProxy for Impala is Configuring Load Balancer for Impala.

An example Hue-Impala section within /etc/haproxy/haproxy.cfg looks as follows:

listen impala-hue :21052
    mode tcp
    stats enable
    balance source

    timeout connect 5000ms
    timeout queue   5000ms
    timeout client  3600000ms
    timeout server  3600000ms

    # Impala Nodes
    server impala-coordinator-001.fqdn impala-coordinator-001.fqdn:21050 check
    server impala-coordinator-002.fqdn impala-coordinator-002.fqdn:21050 check
    server impala-coordinator-003.fqdn impala-coordinator-003.fqdn:21050 check

 

Now let’s review what can impact the overall connection count on an Impala Coordinator: the Hue, Hive and Impala timeout settings.

Example Timeout Settings

The following settings may resemble what you currently have set within your Hue, Hive and Impala services.

Hue

[Screenshot: Hue timeout settings]

 

Hive

[Screenshot: Hive timeout settings]

 

Impala

[Screenshot: Impala timeout settings]

 

Proposed Timeout Settings

Whilst the actual settings will vary from cluster to cluster, we recommend moving away from the defaults and initially setting all of the idle parameters to 2 hours across the board in all three services: Hue, Hive and Impala.

 

The 2-hour setting is an initial goal: introduce timeouts whilst monitoring the user experience. The ultimate best practice in this area is to move toward having:

  • Idle Query Timeouts of 300 seconds (or 5 minutes)
  • Idle Session Timeouts of 600 seconds (or 10 minutes)

NOTE - all of the parameters discussed here relate to ‘idle’ sessions and queries; in other words, the user has to have left the session or query in an idle state before the idle parameters will kick in. No active session or query is affected by this change in service behavior.

Resolution

Hue

Steps to perform:

  • Go to CM - Hue - Configuration
  • Search for “Auto Logout Timeout”
  • Change to 2 hours
  • Restart Hue Service

Hive

Steps to perform:

  • Go to CM - Hive - Configuration
  • Search for “Idle Operation Timeout”
  • Change to 300 seconds
  • Search for “Idle Session Timeout”
  • Change to 600 seconds
  • Restart Hive Service

Hive on Tez

Steps to perform:

  • Go to CM - Hive on Tez - Configuration
  • Search for “Idle Operation Timeout”
  • Change to 300 seconds
  • Search for “Idle Session Timeout”
  • Change to 600 seconds
  • Restart Hive on Tez Service

Impala

Steps to perform:

  • Go to CM - Impala - Configuration
  • Search for “Idle Query Timeout”
  • Change to 300 seconds
  • Search for “Idle Session Timeout”
  • Change to 600 seconds
  • Restart Impala Service
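For reference, the two Impala settings above map to the impalad startup flags --idle_query_timeout and --idle_session_timeout (both in seconds). In Cloudera Manager you would normally use the named configuration fields rather than set the flags by hand, but the equivalent flags would look like this:

```
--idle_query_timeout=300
--idle_session_timeout=600
```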