Storm custom alert for Ambari 2.2.2

Explorer

Currently there are no alerts defined for Storm in Ambari. I am trying to develop a custom alert on the usage of Storm slots. One easy way is to create a type:SCRIPT alert and have my script query the value from Ambari. But I want to create a type:METRIC alert, like the one used for HDFS disk usage.

The HDFS alert looks like this:

{
  "href" : "http://[[hostname]]/api/v1/clusters/star_stage/alert_definitions/27",
  "AlertDefinition" : {
    "cluster_name" : "nameofmycluster",
    "component_name" : "NAMENODE",
    "description" : "This service-level alert is triggered if the HDFS capacity utilization exceeds the configured warning and critical thresholds. It checks the NameNode JMX Servlet for the CapacityUsed and CapacityRemaining properties. The threshold values are in percent.",
    "enabled" : true,
    "id" : 27,
    "ignore_host" : false,
    "interval" : 2,
    "label" : "HDFS Capacity Utilization",
    "name" : "namenode_hdfs_capacity_utilization",
    "scope" : "ANY",
    "service_name" : "HDFS",
    "source" : {
      "jmx" : {
        "property_list" : [
          "Hadoop:service=NameNode,name=FSNamesystemState/CapacityUsed",
          "Hadoop:service=NameNode,name=FSNamesystemState/CapacityRemaining"
        ],
        "value" : "{0}/({0} + {1}) * 100"
      },
      "reporting" : {
        "ok" : {
          "text" : "Capacity Used:[{2:.0f}%, {0}], Capacity Remaining:[{1}]"
        },
        "warning" : {
          "value" : 80.0,
          "text" : "Capacity Used:[{2:.0f}%, {0}], Capacity Remaining:[{1}]"
        },
        "critical" : {
          "value" : 90.0,
          "text" : "Capacity Used:[{2:.0f}%, {0}], Capacity Remaining:[{1}]"
        },
        "units" : "%"
      },
      "type" : "METRIC",
      "uri" : {
        "http" : "{{hdfs-site/dfs.namenode.http-address}}",
        "https" : "{{hdfs-site/dfs.namenode.https-address}}",
        "https_property" : "{{hdfs-site/dfs.http.policy}}",
        "https_property_value" : "HTTPS_ONLY",
        "default_port" : 0.0,
        "high_availability" : {
          "nameservice" : "{{hdfs-site/dfs.nameservices}}",
          "alias_key" : "{{hdfs-site/dfs.ha.namenodes.{{ha-nameservice}}}}",
          "http_pattern" : "{{hdfs-site/dfs.namenode.http-address.{{ha-nameservice}}.{{alias}}}}",
          "https_pattern" : "{{hdfs-site/dfs.namenode.https-address.{{ha-nameservice}}.{{alias}}}}"
        }
      }
    }
  }
}
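As a rough illustration of how the "value" expression and the warning/critical thresholds in a definition like the one above combine, here is a minimal Python sketch. It is not Ambari's implementation: the function name `evaluate_metric_alert` is made up, and treating a value equal to a threshold as crossing it is an assumption.

```python
def evaluate_metric_alert(capacity_used, capacity_remaining,
                          warning=80.0, critical=90.0):
    """Mimic "{0}/({0} + {1}) * 100" plus the reporting thresholds.

    Hypothetical helper for illustration only; not part of Ambari.
    """
    total = capacity_used + capacity_remaining
    if total <= 0:
        return "UNKNOWN", "No capacity reported"

    # Equivalent of the "value" expression in the jmx block.
    percent_used = float(capacity_used) / total * 100

    # Assumption: a value at or above a threshold triggers that state.
    if percent_used >= critical:
        state = "CRITICAL"
    elif percent_used >= warning:
        state = "WARNING"
    else:
        state = "OK"

    # Same template as the "text" fields: {0} and {1} are the raw
    # properties, {2} is the computed value.
    text = "Capacity Used:[{2:.0f}%, {0}], Capacity Remaining:[{1}]".format(
        capacity_used, capacity_remaining, percent_used)
    return state, text


if __name__ == "__main__":
    print(evaluate_metric_alert(850, 150))
```

With 850 used and 150 remaining, the computed value is 85%, which falls between the warning (80) and critical (90) thresholds.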

My question is: for Storm, I don't see any JMX metrics configured in metrics.json. So I tried the following, using the property names exactly as I found them in metrics.json:

{
  "AlertDefinition" : {
    "cluster_name" : "star_stage",
    "component_name" : "NIMBUS",
    "description" : "test storm slots",
    "ignore_host" : false,
    "interval" : 2,
    "label" : "storm Capacity Utilization",
    "name" : "storm_supervisor_utilization",
    "scope" : "ANY",
    "service_name" : "STORM",
    "source" : {
      "ganglia" : {
        "property_list" : [
          "Total Slots",
          "Used Slots"
        ],
        "value" : "{1}/{0} * 100"
      },
      "reporting" : {
        "ok" : {
          "text" : "Capacity Used:[{2:.0f}%, {0}], Capacity Remaining:[{1}]"
        },
        "warning" : {
          "value" : 80.0,
          "text" : "Capacity Used:[{2:.0f}%, {0}], Capacity Remaining:[{1}]"
        },
        "critical" : {
          "value" : 90.0,
          "text" : "Capacity Used:[{2:.0f}%, {0}], Capacity Remaining:[{1}]"
        },
        "units" : "%"
      },
      "type" : "METRIC",
      "uri" : {
        "http" : "www.google.com"
      }
    }
  }
}

I am not sure how this alert works. Does it use the link in uri to get the metrics? How do I use the Ganglia metrics in the alert definition to generate an alert? You are the expert on alerting, @Jonathan Hurley.

Storm metrics.json:

https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/common-services/STORM/0...

What is the significance of Component and HostComponent in metrics.json?

"NIMBUS": {
  "Component": [
    {
      "type": "ganglia",
      "metrics": {"default": {
      .....}}}
],
"HostComponent": [
  {
    "type": "ganglia",
    "metrics": {
      "default": {
        "metrics/boottime": {

5 REPLIES

Re: Storm custom alert for Ambari 2.2.2

Super Collaborator

You can't use Ganglia metrics with alerts; alerts only support JMX- and AMS-style metrics. If Storm doesn't expose JMX-style metrics, then I don't think you can use a METRIC alert here.

Re: Storm custom alert for Ambari 2.2.2

Explorer

To check my understanding: I can see that Ambari shows details about Storm in widgets and on the dashboard. Where is Ambari picking up those details from? I think metrics.json is the link between Ambari and the service; correct me if I am wrong. Thanks, @Jonathan Hurley.

Re: Storm custom alert for Ambari 2.2.2

Super Collaborator

Yes, metrics.json defines the metrics which are pulled into Ambari and then displayed. In the version of Ambari that you're using there are 2 types of metrics for STORM as defined by https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/common-services/STORM/0...

These are either "RestMetricsPropertyProvider" metrics, which are pulled in via a REST API exposed by Storm, or Ganglia/AMS metrics. In your version of Ambari, I don't believe that AMS metrics are supported for alerts. Since Storm doesn't expose metrics in a fashion similar to JMX, alerts won't be able to consume them.
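Since Storm does expose a REST API, the SCRIPT-type alert mentioned in the question could read slot usage from it directly. The following is only a sketch under several assumptions: that the script follows the `get_tokens()` / `execute()` convention used by Ambari's bundled alert scripts, that the alert runs on the host serving the Storm UI, that `{{storm-site/ui.port}}` is the right configuration token (with 8744 as a guessed fallback), and that the UI's `/api/v1/cluster/summary` response carries `slotsTotal` and `slotsUsed`. The helper `slot_usage_result` and the threshold constants are hypothetical names.

```python
import json

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2

# Assumed configuration token for the Storm UI port.
UI_PORT_KEY = "{{storm-site/ui.port}}"

# Hypothetical thresholds, in percent of slots used.
WARNING_PERCENT = 80.0
CRITICAL_PERCENT = 90.0


def get_tokens():
    """Configuration values the script expects to receive in execute()."""
    return (UI_PORT_KEY,)


def slot_usage_result(summary, warning=WARNING_PERCENT,
                      critical=CRITICAL_PERCENT):
    """Turn a parsed cluster summary into (state, [label])."""
    total = summary.get("slotsTotal", 0)
    used = summary.get("slotsUsed", 0)
    if total <= 0:
        return ("UNKNOWN", ["Storm reported no slots"])

    percent = float(used) / total * 100
    label = "Slots Used:[{0:.0f}%, {1}], Total Slots:[{2}]".format(
        percent, used, total)

    if percent >= critical:
        return ("CRITICAL", [label])
    if percent >= warning:
        return ("WARNING", [label])
    return ("OK", [label])


def execute(configurations=None, parameters=None, host_name=None):
    """Fetch the Storm UI cluster summary and classify slot usage."""
    configurations = configurations or {}
    # Assumption: fall back to 8744 if the token was not supplied.
    port = configurations.get(UI_PORT_KEY, "8744")
    url = "http://{0}:{1}/api/v1/cluster/summary".format(host_name, port)
    try:
        response = urlopen(url, timeout=5)
        summary = json.loads(response.read().decode("utf-8"))
    except Exception as error:
        return ("UNKNOWN", ["Could not read {0}: {1}".format(url, error)])
    return slot_usage_result(summary)
```

Keeping the classification in `slot_usage_result` separate from the HTTP call makes it possible to check the thresholds against a sample summary without a running cluster, for example `slot_usage_result({"slotsTotal": 20, "slotsUsed": 17})`.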

Re: Storm custom alert for Ambari 2.2.2

Explorer

OK, as far as I understand: the code that does the alerting cannot read metrics from Storm (Ganglia or REST API), but the widgets are able to show them?

Re: Storm custom alert for Ambari 2.2.2

Super Collaborator

That is correct; alerts can only read JMX metrics in Ambari 2.2.x. In Ambari 2.4, https://issues.apache.org/jira/browse/AMBARI-15766 adds the ability to read metrics from AMS, so you could potentially use that once 2.4.0 is released.