SERVICE_MONITOR_HEAP_SIZE alert

Contributor

I am new to CDH cluster setup. I have CDH 6.3.2 with HA enabled on a 3+5 node cluster (3 masters and 5 data nodes).
For the last 2 days we have been receiving an alert for SERVICE_MONITOR_HEAP_SIZE:
The health test result for SERVICE_MONITOR_HEAP_SIZE has become bad: Heap used: 2,001M. JVM maximum available heap size: 2,048M. Percentage of maximum heap: 97.71%. Critical threshold: 95.00%.

So I increased the heap size to 3 GiB, but we still receive the alert, as below:
The health test result for SERVICE_MONITOR_HEAP_SIZE has become bad: Heap used: 3,004M. JVM maximum available heap size: 3,072M. Percentage of maximum heap: 97.79%. Critical threshold: 95.00%.

How can I estimate heap size? How can I fix this issue?
Please assist me step by step to fix the issue.
Thank you

Guru

Hi @manjj,

Thanks for reaching out to the Cloudera community.

Could you please share your Cloudera Manager version?

The heap size that Service Monitor (SMON) needs usually depends on how many services it monitors and what kind they are.

The public documentation lists the requirements:

https://docs.cloudera.com/documentation/enterprise/release-notes/topics/hardware_requirements_guide....

To resolve the issue, you need to increase the JVM heap size. I would suggest increasing it by 1 GiB at a time, so change it to 4 GiB and see how it goes. Also make sure that the SMON host has enough free memory left for OS overhead.
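
As a rough illustration of that advice, the sketch below steps the heap up by 1 GiB and checks the result against host memory. The host figures (`host_free_mib`, `os_reserve_mib`) and the function names are hypothetical placeholders, not values from this cluster or from Cloudera Manager.

```python
MIB_PER_GIB = 1024


def next_heap_mib(current_mib, step_mib=MIB_PER_GIB):
    """Return the next heap size to try, one step above the current one."""
    return current_mib + step_mib


def fits_on_host(heap_mib, host_free_mib, os_reserve_mib):
    """True if the heap plus an OS reserve fits in the host's free memory."""
    return heap_mib + os_reserve_mib <= host_free_mib


current = 3 * MIB_PER_GIB            # heap size from the question (3 GiB)
candidate = next_heap_mib(current)   # 4 GiB, the next value suggested in the reply

# Hypothetical host numbers, for illustration only.
print(candidate, fits_on_host(candidate, host_free_mib=8192, os_reserve_mib=2048))
```

If the check fails, the next step would not leave room for the OS on that host, so raising the heap further is not a safe option there.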

 

Thanks and hope this helps,

Li

Li Wang, Technical Solution Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Learn more about the Cloudera Community:

Terms of Service

Community Guidelines

How to use the forum

Contributor

Thanks lwang,
I increased the JVM heap size to 5 GiB. Let's see how it works.

 

Version: Cloudera Express 6.3.0 (#1281944 built by jenkins on 20190719-0609 git: 5b793e9c9cb3f40b3912044aac00abb635183191)

Java VM Name: Java HotSpot(TM) 64-Bit Server VM

Java Version: 1.8.0_181

 

 

Contributor

Hi lwang,

I noticed that we have only 285 entries in Service Monitor (from Cloudera Management Service Monitored Entities). I recently increased the heap size to 5 GiB but still receive the alert:

The health test result for SERVICE_MONITOR_HEAP_SIZE has become bad: Heap used: 4,991M. JVM maximum available heap size: 5,120M. Percentage of maximum heap: 97.48%. Critical threshold: 95.00%.
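
The three alerts in this thread can be compared side by side. The sketch below is illustrative only: it parses the alert text quoted in the posts with a hypothetical regular expression (the `PATTERN`, `parse_alert`, and `heap_percent` names are not part of Cloudera Manager) and recomputes the percentage for each heap size.

```python
import re

# Alert bodies copied from the posts in this thread.
ALERTS = [
    "Heap used: 2,001M. JVM maximum available heap size: 2,048M. "
    "Percentage of maximum heap: 97.71%. Critical threshold: 95.00%.",
    "Heap used: 3,004M. JVM maximum available heap size: 3,072M. "
    "Percentage of maximum heap: 97.79%. Critical threshold: 95.00%.",
    "Heap used: 4,991M. JVM maximum available heap size: 5,120M. "
    "Percentage of maximum heap: 97.48%. Critical threshold: 95.00%.",
]

# Illustrative pattern matched against the wording quoted above.
PATTERN = re.compile(
    r"Heap used: ([\d,]+)M\. JVM maximum available heap size: ([\d,]+)M\."
    r".*?Critical threshold: ([\d.]+)%"
)


def parse_alert(text):
    """Return (used_mib, max_mib, critical_threshold_percent) from an alert."""
    match = PATTERN.search(text)
    if match is None:
        raise ValueError("unrecognized alert format")
    used = int(match.group(1).replace(",", ""))
    maximum = int(match.group(2).replace(",", ""))
    return used, maximum, float(match.group(3))


def heap_percent(used, maximum):
    """Used heap as a percentage of the maximum heap."""
    return 100.0 * used / maximum


for used, maximum, threshold in map(parse_alert, ALERTS):
    pct = heap_percent(used, maximum)
    print(f"max={maximum}M used={used}M pct={pct:.2f}% critical={pct >= threshold}")
```

Used heap stays at roughly the same fraction of the maximum at 2 GiB, 3 GiB, and 5 GiB, so the heap appears to fill whatever it is given. That pattern suggests that raising the limit alone may not clear the alert.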

Guru

Hi @manjj,

This may be related to a known issue, which we can confirm by trying the workaround below:

  1. Go to CM > Hive > Configuration
  2. Search for "Hive Metastore Canary Health Test"
  3. Uncheck the box and save the change
  4. Restart Service Monitor

Please report back whether the workaround above actually makes SMON stable, and then we can talk about next steps.

 

Thanks,

Li

Li Wang, Technical Solution Manager


Contributor

Hi lwang,

As suggested, I disabled 'Hive Metastore Canary Health Test' and also reduced the heap size from 5 GiB to 2 GiB.

For the last 14 hours we have not seen any alert from Service Monitor.

 

Thanks,

Guru

Hi @manjj,

Thanks for reporting back on the progress. There may still be a leak somewhere in Hive, even though this bug was already fixed.

You can follow the steps below if you do want to keep the Hive Metastore canary test turned on.

Steps:

  1. If the Hive Metastore canary test is disabled, re-enable it.
  2. In the configuration of both HDFS and Hive, find Service Monitor Client Config Overrides and add an entry for "fs.file.impl.disable.cache" with the value "true".
  3. Restart Service Monitor.
  4. Observe whether the heap stays stable with the canary back on.
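
For step 2, the entry is an ordinary name/value property. A minimal sketch, assuming the override field accepts the usual Hadoop `<property>` XML form (that is an assumption about the Cloudera Manager field, and `hadoop_property` is a hypothetical helper), builds the snippet with Python's standard library:

```python
import xml.etree.ElementTree as ET


def hadoop_property(name, value):
    """Build a Hadoop-style <property> element with <name> and <value> children."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


# Property name and value are the ones given in step 2.
snippet = ET.tostring(
    hadoop_property("fs.file.impl.disable.cache", "true"),
    encoding="unicode",
)
print(snippet)
```

The same property would be added once for HDFS and once for Hive, as step 2 describes. If the field offers separate name and value boxes instead, only the two strings are needed.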

Thanks and hope this helps,

Li

Li Wang, Technical Solution Manager

