IMPALAD_QUERY_MONITORING_STATUS has become bad

Expert Contributor

We receive many of these alerts when we run a large number of queries. We just moved to CDH 5.7.1; with the same configuration on CDH 5.5.1 we did not receive such alerts or see such issues. Can anyone help us understand the reason behind this and how to resolve it?

Impala version: 2.5.0+cdh5.7.1+0

The health test result for IMPALAD_QUERY_MONITORING_STATUS has become bad: There are 1 error(s) seen monitoring executing queries, and 0 errors(s) seen monitoring completed queries for this role in the previous 5 minute(s). Critical threshold: any.

followed by the following warning:

The health test result for IMPALA_IMPALADS_HEALTHY has become bad: Healthy Impala Daemon: 9. Concerning Impala Daemon: 0. Total Impala Daemon: 10. Percent healthy: 90.00%. Percent healthy or concerning: 90.00%. Critical threshold: 90.00%.

 


Contributor

Hi Jais,

Is there any more information regarding the errors mentioned? Also, are you encountering any problems using Impala? If not, it would be worth posting on the Cloudera Manager board as well: https://community.cloudera.com/t5/Cloudera-Manager-Installation/bd-p/CMInstall

- Sailesh

Explorer

Hi Jais,

I have the same issue as well after moving to CDH 5.7.1.

Here is my post on the same issue:

https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/ImpalaDaemonQueryMonitoring-throttling...

It would be great if anyone could help us with this.

Thanks,

Suresh

Explorer

I'm using Cloudera Enterprise 5.11.1 and getting a lot of "IMPALAD_QUERY_MONITORING_STATUS has become bad" alerts. Any help would be greatly appreciated.

 

Healthy Impala Daemon: 17. Concerning Impala Daemon: 1. Total Impala Daemon: 22. Percent healthy: 77.27%. Percent healthy or concerning: 81.82%. Critical threshold: 90.00%.
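
The percentages in that summary follow from the daemon counts. A minimal sketch of the arithmetic, assuming "percent healthy" is healthy / total and "percent healthy or concerning" is (healthy + concerning) / total; the function name is illustrative and not part of Cloudera Manager:

```python
def daemon_health_summary(healthy, concerning, total):
    """Recompute the figures shown in the IMPALA_IMPALADS_HEALTHY message."""
    # Daemons that are neither healthy nor concerning (assumed to be the
    # ones in bad or otherwise unavailable health).
    other = total - healthy - concerning
    pct_healthy = round(100.0 * healthy / total, 2)
    pct_healthy_or_concerning = round(100.0 * (healthy + concerning) / total, 2)
    return other, pct_healthy, pct_healthy_or_concerning


# Counts taken from the message above: 17 healthy, 1 concerning, 22 total.
print(daemon_health_summary(17, 1, 22))  # (4, 77.27, 81.82)
```

With these counts, 4 of the 22 daemons are neither healthy nor concerning, which is what pushes the service below the 90.00% threshold.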

 

 

Health Test Name: IMPALAD_QUERY_MONITORING_STATUS
Event Code: Role health test bad
Severity: Critical
Content: The health test result for IMPALAD_QUERY_MONITORING_STATUS has become bad: There are 1 error(s) seen monitoring executing queries, and 0 errors(s) seen monitoring completed queries for this role in the previous 5 minute(s). Critical threshold: any.


I'd suggest looking at the Impala->Queries page in Cloudera Manager to see which queries failed. You can filter the queries by selecting "failed or cancelled" in the drop-down next to "Search" or, equivalently, use this query in the search box:

 

query_state = EXCEPTION

The alert doesn't distinguish between causes of failure so it could be something innocent (a user is trying to develop a query and getting a lot of syntax errors) or it could be a sign of a bigger problem.
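
To separate the innocent failures from the worrying ones, it can help to pull the same filtered list out of Cloudera Manager and group it. The sketch below assumes the Cloudera Manager REST API exposes an impalaQueries endpoint that accepts the same filter string, and that each returned query carries a "user" field; the host, cluster name, service name and API version are placeholders, so check them against the API documentation for your release:

```python
from collections import Counter
from urllib.parse import urlencode

# Same filter string as the one typed into the CM search box.
FAILED_FILTER = "query_state = EXCEPTION"


def failed_queries_url(cm_base_url, cluster, service, api_version, limit=100):
    """Build the request URL. All arguments are placeholders for your setup."""
    query = urlencode({"filter": FAILED_FILTER, "limit": limit})
    return (f"{cm_base_url}/api/{api_version}/clusters/{cluster}"
            f"/services/{service}/impalaQueries?{query}")


def failures_by_user(response):
    """Count failed queries per user from the decoded JSON response."""
    return Counter(q.get("user", "unknown") for q in response.get("queries", []))


# Hand-written sample in the assumed response shape, not real output.
sample = {"queries": [
    {"queryId": "q1", "user": "etl", "queryState": "EXCEPTION"},
    {"queryId": "q2", "user": "etl", "queryState": "EXCEPTION"},
    {"queryId": "q3", "user": "analyst", "queryState": "EXCEPTION"},
]}

print(failed_queries_url("http://cm-host:7180", "cluster", "impala", "v12"))
print(failures_by_user(sample).most_common())
```

If one scheduled account accounts for most of the failures, that points at a job problem; if they are spread across interactive users, they are more likely development errors.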

Explorer
Thanks Tim, very helpful. I noticed a few queries causing issues and am looking into them. Do you think we need to reconfigure any Impala settings to handle these types of queries?


@Clouderaks I think it depends on the nature of the failures: whether they're critical queries or just users messing around.

Explorer

They are critical queries. We keep getting these alerts every day. I'm not sure how to fix them.

Explorer

ERROR: 2246 Monitor-HostMonitor Throttling logger – (30 skipped) could not find local file system for /var/run/cloudera-scm-agent/process error

 

I’m getting the “Could not find local file system for /var/run/cloudera-scm-agent/process” error. I am having the issue only on the test environment. My development and production environments are fine and are not getting the "IMPALAD_QUERY_MONITORING_STATUS has become bad" alerts, even though the “could not find local file system” error occurs on dev and production too, and I'm not sure why. I can see the “Could not find local file system for /var/run/cloudera-scm-agent/process” error only on the test environment. Do you think some Impala settings on test differ from those on prod and dev, or is the issue related to the tmpfs file system?

 

The file system type of /var/run/cloudera-scm-agent/process is tmpfs. Do you think I need to increase the size of the tmpfs? “grep MemFree /proc/meminfo” reports “MemFree: 5021480 kB”, which is about 5 GB, so tmpfs may not be the issue.

I expect this folder to be very small, so most likely the tmpfs size is not the issue. Do you think we need to check the size of /var/run/cloudera-scm-agent/process and make sure it's not more than 5 GB?

 

Any suggestions would be greatly appreciated.
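
On the tmpfs question: MemFree in /proc/meminfo reports free RAM, not how full a particular tmpfs mount is, so it is worth checking the mount itself. A minimal sketch, assuming a Linux host with /proc/mounts; the helper names are illustrative, and mount points containing escaped characters (such as spaces) are not handled:

```python
import os
import shutil


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type) for the longest mount point containing path."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the most specific mount; later entries win ties.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def usage_percent(path):
    """Percentage of the file system holding path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return round(100.0 * usage.used / usage.total, 1)


PROCESS_DIR = "/var/run/cloudera-scm-agent/process"

# Only runs on a host where the agent directory actually exists.
if os.path.isdir(PROCESS_DIR) and os.path.isfile("/proc/mounts"):
    with open("/proc/mounts") as mounts:
        # realpath resolves /var/run when it is a symlink to /run.
        print(find_mount(os.path.realpath(PROCESS_DIR), mounts.read()))
    print(usage_percent(PROCESS_DIR), "% used")
```

Running the same check on test, dev and prod would show whether the mount is missing or full on test only.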

New Contributor

That's very useful