Monitoring Spill-to-disk from Cloudera Manager

avatar
Explorer

Hello Team,

 

According to the release notes for Impala 2.5:

 

+++++
Spill-to-disk feature now always recommended. In earlier releases, the spill-to-disk feature could be turned off using a pair of configuration settings,
enable_partitioned_aggregation=false and enable_partitioned_hash_join=false.

The latest improvements in the spill-to-disk mechanism, and related features that interact with it, make this feature robust enough that disabling it is now no longer needed or supported. In particular, some new features in Impala 2.5 and higher do not work when the spill-to-disk feature is disabled.
+++++

 

If spill-to-disk is enabled, is there a way to monitor spill-to-disk events so that I can identify the query that is causing them?


avatar

 

The CM queries tab keeps track of "Memory Spilled" per query. You can choose to display it via "select attributes" and also search for queries based on memory_spilled in the search box. If you click the down arrow next to the query and look at "query details", the information is in there too.

 

The "Utilization Report" UI also has some aggregate information about memory spilled per resource pool.

 

 

[Attached screenshot: cm-spilled.png]

avatar
Explorer

Thanks Tim for your reply.

 

The first option will display the details of the memory spilled per query from the impala query section if that attribute is selected to be displayed.

 

The second option under the utilization section will give us the details of average spill and maximum spill per resource pool.

 

My requirement is this: if I enable the spill-to-disk feature in my cluster, I want to be notified whenever a spill to disk happens. Is there an option in Cloudera Manager to create such an alert?

 

avatar
Explorer

Is spill-to-disk activity written to any log? If so, I can set up an alert from Splunk.

avatar

Depending on exactly what you want to trigger on, you can use the generic function in CM to trigger based on any tsquery expression: https://www.cloudera.com/documentation/enterprise/latest/topics/cm_dg_triggers_usecases.html . There are a number of metrics tracking spill-to-disk: https://www.cloudera.com/documentation/enterprise/latest/topics/cm_metrics_impala.html
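
As a rough illustration of watching an aggregate spill metric outside of a CM trigger, the sketch below builds a request URL for the Cloudera Manager timeseries endpoint and checks a parsed response for any non-zero data point. The host name, the API version string, the exact tsquery text, and the response field names (`items`, `timeSeries`, `data`, `value`) are assumptions for illustration and should be checked against your CM version's API documentation.

```python
from urllib.parse import urlencode


def timeseries_url(cm_base, api_version, tsquery):
    """Build a timeseries request URL (endpoint layout is an assumption)."""
    return f"{cm_base}/api/{api_version}/timeseries?" + urlencode({"query": tsquery})


def any_spill(response):
    """Return True if any data point in the parsed JSON response is above zero.

    The nesting items -> timeSeries -> data -> value is an assumed response shape.
    """
    for item in response.get("items", []):
        for series in item.get("timeSeries", []):
            for point in series.get("data", []):
                if point.get("value", 0) > 0:
                    return True
    return False


# Hypothetical host, API version, and tsquery text.
url = timeseries_url(
    "http://cm.example.com:7180",
    "v19",
    "select queries_spilled_memory_rate where category = CLUSTER",
)
```

Fetching the URL (with whatever authentication your CM instance requires) and passing the decoded JSON to `any_spill` gives a yes/no signal only; it does not identify which query spilled.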

 

I don't fully understand the goal though - generally spill-to-disk happens transparently as part of normal query processing when memory is constrained and isn't cause for concern.

 

If your aim is to prevent runaway spilling, the scratch_limit query option is a direct way to do that: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_scratch_limit.html . You can set the default query option globally or set default query options per-resource-pool via the "Dynamic Resource Pools" UI in CM. https://www.cloudera.com/documentation/enterprise/latest/topics/impala_disable_unsafe_spills.html is also occasionally useful.
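
To make the scratch_limit suggestion concrete, here is a minimal sketch of a helper that produces the corresponding `SET` statement from a byte count. The function name is hypothetical, and the special values (-1 for unlimited, 0 to disallow spilling) follow my reading of the scratch_limit documentation linked above, so treat them as assumptions to verify.

```python
def scratch_limit_statement(limit_bytes):
    """Return an Impala SET statement for the SCRATCH_LIMIT query option.

    limit_bytes: maximum scratch space in bytes. Assumed special values:
    -1 means unlimited, 0 disallows spilling altogether.
    """
    if not isinstance(limit_bytes, int) or limit_bytes < -1:
        raise ValueError("limit_bytes must be an integer >= -1")
    return f"SET SCRATCH_LIMIT={limit_bytes};"


# Example: cap spilling at 10 GiB for the current session.
statement = scratch_limit_statement(10 * 1024**3)
```

The resulting statement applies to a single session (for example, when run in impala-shell); pool-wide or global defaults are set through CM as described above.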

avatar
Explorer

Hi Tim,

 

Thank you for your inputs. 

 

I have checked the metrics links you gave. Does the unit "queries per second" refer to a list of individual queries or to a count of queries?

 

ex:

 

Metric Name | Description | Unit | Parents | CDH Version
queries_spilled_memory_rate | Impala queries that spilled to disk | queries per second | cluster | CDH 5

 

I need to track the specific queries that are spilling to disk. That is, if I enable the spill-to-disk option, I need an alert, including the query details, whenever a specific query spills memory to disk, so that I can notify the owner of that query.

 

I can set the scratch limit to a specific value to control spill space usage. However, I still need to track each and every query that spills to disk.

avatar
Explorer

I am using Splunk in my environment. Is anything logged when a query spills to disk? If so, I can create a Splunk alert that sends me the query details whenever a spill to disk happens.

avatar

I'm planning to get back to you with an answer - just haven't been able to find the time yet 🙂

avatar
Explorer

Any luck?

avatar

I looked into it and we don't currently support per-query alerts. I passed along this feedback to the Cloudera Manager team. I guess we already covered it, but my two suggestions would be:

  • Set a default scratch_limit per-pool or globally so that users don't accidentally write queries that spill a lot of data
  • Set up monitoring for some aggregate threshold, then use the queries page to discover the spilling queries.

My philosophy on this is that spilling queries are nothing to be concerned about as long as queries are completing fast enough for your needs.
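
Since per-query alerts are not available in CM, one possible workaround is to poll for spilling queries and forward the results to an external tool such as Splunk. The sketch below only builds the request URL and filters a parsed response; it assumes the Cloudera Manager REST API exposes an `impalaQueries` endpoint that accepts the same `memory_spilled` filter as the queries page search box, and that each returned query carries `queryId`, `user`, `statement`, and an `attributes` map. The host, API version, cluster name, and service name are hypothetical placeholders.

```python
from urllib.parse import quote, urlencode


def impala_queries_url(cm_base, api_version, cluster, service, start_iso, end_iso):
    """Build a URL listing queries that spilled within a time window (assumed endpoint)."""
    params = {
        "filter": "memory_spilled > 0",  # same attribute the CM search box accepts
        "from": start_iso,
        "to": end_iso,
        "limit": 100,
    }
    path = (
        f"/api/{api_version}/clusters/{quote(cluster, safe='')}"
        f"/services/{quote(service, safe='')}/impalaQueries"
    )
    return f"{cm_base}{path}?{urlencode(params)}"


def spilling_queries(response):
    """Extract the queries with a non-zero memory_spilled attribute.

    Field names (queries, queryId, user, statement, attributes) are assumptions.
    """
    hits = []
    for q in response.get("queries", []):
        spilled = float(q.get("attributes", {}).get("memory_spilled", 0) or 0)
        if spilled > 0:
            hits.append({
                "query_id": q.get("queryId"),
                "user": q.get("user"),
                "memory_spilled": spilled,
                "statement": q.get("statement"),
            })
    return hits
```

A scheduled job could fetch this URL for the most recent window and write each hit as a log line, which Splunk can then alert on. The query owner is available in the `user` field if the response carries it.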