Understanding cloudera charts

Explorer
Can someone help me with Cloudera Manager charts? I couldn't find a good explanation of the charts available in Cloudera Manager. How can I correlate errors with the charts, e.g. the CPU and I/O charts?

Re: Understanding cloudera charts

Expert Contributor

Re: Understanding cloudera charts

Explorer

Hello manuroman,

How can I map this information to the charts, e.g. the Hive canary chart? How can I tell whether a chart is associated with an issue? I need detailed information about the charts and about how to correlate an issue with them.

Re: Understanding cloudera charts

Cloudera Employee

Hi Kamal,

 

If you don't mind, could you please tell us which charts you would like to understand, and which errors you would like to correlate with them? That will help us explain the Cloudera Manager charts.

 

Thanks,

Senthil Kumar

Re: Understanding cloudera charts

Explorer
For example, the Hive Metastore Canary Duration chart.

Re: Understanding cloudera charts

Cloudera Employee
This is a Hive Metastore health test that checks that a client can connect and perform basic operations. The operations include: (1) creating a database, (2) creating a table within that database with several types of columns and two partition keys, (3) creating a number of partitions, and (4) dropping both the table and the database.

The database is created under the /user/hue/.cloudera_manager_hive_metastore_canary/<Hive Metastore role name>/ and is named "cloudera_manager_metastore_canary_test_db". The test returns "Bad" health if any of these operations fail. The test returns "Concerning" health if an unknown failure happens.

The canary publishes a metric 'canary_duration' for the time it took for the canary to complete. Here is an example of a trigger, defined for the Hive Metastore role configuration group, that changes the health to "Bad" when the duration of the canary is longer than 5 sec:

"IF (SELECT canary_duration WHERE entityName=$ROLENAME AND category = ROLE and last(canary_duration) > 5s) DO health:bad"

A failure of this health test may indicate that the Hive Metastore is failing basic operations. Check the logs of the Hive Metastore and the Cloudera Manager Service Monitor for more details. This test can be enabled or disabled using the Hive Metastore Canary Health Test Hive Metastore monitoring setting.
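
As a rough illustration of what the trigger above expresses, the condition "last(canary_duration) > 5s" can be sketched in Python. This is not Cloudera Manager code: the Sample type, the trigger_health function and the "good" and "unknown" labels are hypothetical names used only for this sketch, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    """One data point of the canary_duration metric (hypothetical type)."""
    timestamp: float          # seconds since the epoch
    canary_duration_s: float  # how long the canary run took, in seconds


def trigger_health(samples: Sequence[Sample], threshold_s: float = 5.0) -> str:
    """Mimic the trigger: health is "bad" when the most recent
    canary_duration value exceeds the threshold (5 s in the example)."""
    if not samples:
        # Assumption: with no data the sketch reports "unknown".
        return "unknown"
    # last(...) in the trigger refers to the most recent data point.
    latest = max(samples, key=lambda s: s.timestamp)
    return "bad" if latest.canary_duration_s > threshold_s else "good"


# Made-up values: the latest canary run took 7.5 s, so health is "bad".
history = [Sample(timestamp=1.0, canary_duration_s=2.0),
           Sample(timestamp=2.0, canary_duration_s=7.5)]
print(trigger_health(history))
```

Read this way, a spike on the Hive Metastore Canary Duration chart above the 5-second line is the same event that would fire the trigger, so the chart and the health status can be compared directly.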

Ref: https://www.cloudera.com/documentation/enterprise/5-7-x/topics/cm_ht_hive_metastore_server.html#conc...

Re: Understanding cloudera charts

Explorer

For example, suppose I receive this alert for the Activity Monitor:

Pause Duration Bad
Average time spent paused was 2 minute(s), 54 second(s) (290.37%) per minute over the previous 5 minute(s). Critical threshold: 60.00%.

 

There are various charts, such as disk latency, disk throughput, network throughput, and garbage collection time.

How can I work out what caused this problem? Will any specific chart help here?

 

 

Re: Understanding cloudera charts

Cloudera Employee

This is a garbage collection (GC) pause.

Check how much JVM heap was in use for the service (HS2, etc.) for which you received this alert.

 

From the alert, you can see that the JVM spent more than 2 minutes paused per minute on average, and that the alert is configured to turn critical when pauses exceed 60% of each minute. Look at the JVM Heap Memory Usage and GC pause charts for the service that raised the alert. If the heap usage is constantly high, that is the likely cause. In that case, the solution could be as simple as increasing the heap size.
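
To make the numbers in the alert concrete, here is a small Python sketch of the arithmetic. It assumes the percentage is simply paused time divided by the one-minute window; the function names are hypothetical, and whether Cloudera Manager compares with ">" or ">=" is an assumption.

```python
def pause_percent(paused_seconds: float, window_seconds: float = 60.0) -> float:
    """Paused time expressed as a percentage of the window (one minute here)."""
    return paused_seconds / window_seconds * 100.0


def exceeds_critical(percent: float, critical_percent: float = 60.0) -> bool:
    """True when the pause percentage is above the critical threshold.
    Assumption: a strict comparison is used."""
    return percent > critical_percent


# 2 minutes 54 seconds is about 174 seconds paused per minute, which is
# roughly 290% of a 60-second window and far above the 60% threshold.
pct = pause_percent(2 * 60 + 54)
print(f"{pct:.0f}% of each minute paused, critical: {exceeds_critical(pct)}")
```

A value this far above the threshold is why the heap usage and GC charts are the first place to look, before the disk or network charts.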

 

You can refer to the Cloudera documentation [1][2].

 

[1] https://www.cloudera.com/documentation/enterprise/5-7-x/topics/cm_ht_hiveserver2.html

[2] https://www.cloudera.com/documentation/enterprise/5-7-x/topics/cm_ht_hive_metastore_server.html