
Cloudera Employee

Summary

Are you seeing a disproportionate number of queries being handled by a single Impala Coordinator?

Does this eventually lead to OOM scenarios?

Let’s say you have three Impala Coordinators in your cluster and notice that queries skew onto one of them and overwhelm it.

 

[Screenshot: Cloudera Manager view of running queries per Impala Coordinator]

Note how one of the Impala Coordinators in the example above has 73 running queries, while the other two have relatively few.
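One way to spot this skew outside Cloudera Manager is to poll each coordinator’s debug web UI and compare query counts. The sketch below is illustrative only: the /metrics?json endpoint, its JSON layout, and the metric name 'impala-server.num-queries-registered' are assumptions to verify against your Impala version, and the hostnames are placeholders.

```python
import json
from urllib.request import urlopen

# Hypothetical coordinator hostnames, matching the example cluster above.
COORDINATORS = [
    "impala-coordinator-001.fqdn",
    "impala-coordinator-002.fqdn",
    "impala-coordinator-003.fqdn",
]

def running_queries(host: str, port: int = 25000) -> int:
    """Read one coordinator's registered-query count from its debug web UI.

    Assumes the impalad debug endpoint serves JSON at /metrics?json and
    includes 'impala-server.num-queries-registered'; verify both against
    your Impala version before relying on this.
    """
    with urlopen(f"http://{host}:{port}/metrics?json") as resp:
        payload = json.load(resp)
    # Flatten whatever metric list comes back and look the name up.
    flat = {m.get("name"): m.get("value") for m in payload.get("metrics", [])}
    return int(flat.get("impala-server.num-queries-registered", 0))

def detect_skew(counts: dict, factor: float = 2.0) -> list:
    """Return coordinators carrying more than `factor` x the mean load."""
    mean = sum(counts.values()) / len(counts)
    return sorted(h for h, c in counts.items() if c > factor * mean)
```

With the counts from the example above (73, 4, and 6 running queries), detect_skew flags only the overloaded coordinator.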

Investigation

Source IP Persistence

To understand why queries can skew onto one Impala Coordinator, look at how the proxy is configured to route incoming connections.

 

‘Source IP Persistence’ means that sessions from the same client IP address always go to the same coordinator. This setting is required when setting up high availability with Hue. It is also required to avoid the Hue message ‘results have expired’, which appears when a query is sent to the cluster through one coordinator but the results are fetched through a different coordinator/Hue server.
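To see why source persistence can concentrate load, it helps to model the balancing decision. Below is a rough Python model of map-based source hashing (HAProxy’s real hash function differs, and the backend names are hypothetical): a fixed client IP always maps to the same coordinator, so a small number of client IPs, such as one or two Hue servers, can pin most sessions to a single coordinator.

```python
import zlib

# Hypothetical backend names standing in for the coordinators.
BACKENDS = [
    "impala-coordinator-001",
    "impala-coordinator-002",
    "impala-coordinator-003",
]

def pick_backend(client_ip: str, backends=BACKENDS) -> str:
    """Map-based source hashing: hash the client IP, then take it modulo
    the live backend count. crc32 stands in for HAProxy's internal hash;
    the point is only that a fixed IP always yields the same backend."""
    return backends[zlib.crc32(client_ip.encode()) % len(backends)]
```

Every session from a given Hue server therefore lands on the same coordinator, which is exactly the kind of skew shown above when most traffic arrives through one Hue instance.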

Example HAProxy Configuration for Source IP Persistence

The public documentation for setting up HAProxy for Impala is Configuring Load Balancer for Impala.

An example Hue-Impala section within /etc/haproxy/haproxy.cfg looks as follows:

listen impala-hue :21052
    mode tcp
    stats enable
    balance source

    timeout connect 5000ms
    timeout queue   5000ms
    timeout client  3600000ms
    timeout server  3600000ms

    # Impala Nodes
    server impala-coordinator-001.fqdn impala-coordinator-001.fqdn:21050 check
    server impala-coordinator-002.fqdn impala-coordinator-002.fqdn:21050 check
    server impala-coordinator-003.fqdn impala-coordinator-003.fqdn:21050 check

 

Now let’s review what can impact the overall connection count on an Impala Coordinator: the Hue, Hive and Impala timeout settings.

Example Timeout Settings

The following settings may resemble what you currently have set within your Hue, Hive and Impala services.

Hue

[Screenshot: Hue timeout settings]

 

Hive

[Screenshot: Hive timeout settings]

 

Impala

[Screenshot: Impala timeout settings]

 

Proposed Timeout Settings

Whilst the actual settings will vary from cluster to cluster, we recommend moving away from the defaults and initially setting all of the idle parameters to 2 hours across the board in all three services: Hue, Hive and Impala.

 

The 2-hour setting is an initial goal: introduce timeouts whilst monitoring the user experience. The ultimate best practice in this area is to move toward having:

  • Idle Query Timeouts of 300 seconds (or 5 minutes)
  • Idle Session Timeouts of 600 seconds (or 10 minutes)

NOTE - all of the parameters discussed here relate to ‘idle’ sessions and queries; in other words, the user has to have left the session or query in an idle state before the idle parameters will kick in. No active session or query is affected by this change in service behavior.

Resolution

Hue

Steps to perform:

  • Go to CM - Hue - Configuration
  • Search for “Auto Logout Timeout”
  • Change to 2 hours
  • Restart Hue Service

Hive

Steps to perform:

  • Go to CM - Hive - Configuration
  • Search for “Idle Operation Timeout”
  • Change to 300 seconds
  • Search for “Idle Session Timeout”
  • Change to 600 seconds
  • Restart Hive Service

Hive on Tez

Steps to perform:

  • Go to CM - Hive on Tez - Configuration
  • Search for “Idle Operation Timeout”
  • Change to 300 seconds
  • Search for “Idle Session Timeout”
  • Change to 600 seconds
  • Restart Hive on Tez Service

Impala

Steps to perform:

  • Go to CM - Impala - Configuration
  • Search for “Idle Query Timeout”
  • Change to 300 seconds
  • Search for “Idle Session Timeout”
  • Change to 600 seconds
  • Restart Impala Service
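For reference, the two Impala settings above map to the impalad startup flags --idle_query_timeout and --idle_session_timeout (both in seconds). In Cloudera Manager you would normally use the named configuration fields rather than set the flags by hand, but the equivalent flags would look like this:

```
--idle_query_timeout=300
--idle_session_timeout=600
```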