Member since 05-02-2016 | 74 Posts | 41 Kudos Received | 14 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3775 | 07-11-2018 01:40 PM
 | 7518 | 01-05-2017 02:43 PM
 | 1692 | 12-20-2016 01:17 PM
 | 1565 | 12-02-2016 07:19 PM
 | 2378 | 10-06-2016 01:29 PM
07-11-2018
01:40 PM
1 Kudo
You should be able to download Spark 2.3 and "install" it only on the edge node. Since spark-submit essentially starts a YARN job, it will distribute the resources needed at runtime. One thing to note is that the external shuffle service will still be using the HDP-installed lib, but that should be fine. More info here: https://spark.apache.org/docs/latest/running-on-yarn.html. Keep in mind that if you get stuck, Hortonworks support will likely be limited in the assistance it can offer, since only the HDP-installed version of Spark is within the support scope.
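A minimal sketch of what that edge-node setup could look like. It assumes Spark 2.3 was unpacked under /opt/spark-2.3.0-bin-hadoop2.7 (a hypothetical path) and that the cluster's client configs live in /etc/hadoop/conf. The run_on_yarn helper and its DRY_RUN switch are illustrative names, not part of Spark:

```shell
#!/bin/sh
# Hypothetical location of the separately downloaded Spark 2.3 build (edge node only).
export SPARK_HOME="${SPARK_HOME:-/opt/spark-2.3.0-bin-hadoop2.7}"
# spark-submit reads the Hadoop/YARN client configs from here to locate the cluster.
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

# Illustrative helper: submit an application to YARN with the edge node's Spark 2.3.
# With DRY_RUN=1 it only prints the command instead of running it.
run_on_yarn() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$SPARK_HOME/bin/spark-submit" --master yarn --deploy-mode cluster "$@"
  else
    "$SPARK_HOME/bin/spark-submit" --master yarn --deploy-mode cluster "$@"
  fi
}
```

For example, `DRY_RUN=1 run_on_yarn --class org.apache.spark.examples.SparkPi /path/to/app.jar 10` prints the command that would be submitted, which is a cheap way to confirm the right Spark build is being picked up before running a real job.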
04-19-2017
11:52 AM
These checklists help ensure that you avoid unexpected problems when adding new hosts to an HDP cluster.
Preparation checklist (what to do before adding the new hosts via the Ambari Add Host wizard):
Have you followed the guidance in the Ambari installation documentation (an example install doc is here)? Follow the instructions for the HDP version that is installed on the cluster (see http://docs.hortonworks.com).
Do your file system partitioning scheme, storage options, memory, network, and power configurations match what is used for the other nodes in the cluster? Either follow the pattern of what is already on the cluster or follow the recommendations from docs.hortonworks.com (see example doc here). Compare and verify that settings and configurations match, including: THP, SELinux, SSH keys, user integration, LDAP/AD setup, SSSD integration, drive layouts, etc.
Have you verified that JAVA_HOME is set properly with same Java version across the cluster? Having mismatched JDK versions across your nodes may cause failures and should be remediated for optimal cluster operation.
Do the new nodes have the same versions of Linux, yum, rpm, scp, python and curl as the pre-existing cluster nodes?
Do all the /etc/hosts files match and reflect both the newly added and the existing nodes? Check DNS across the nodes to confirm that host names resolve properly.
Have you enabled NTP on all the newly added nodes and made sure they are in sync with the other nodes?
Have you confirmed that iptables rules are updated or that iptables is turned off? If you have iptables or firewall rules for specific ports, make sure those ports are open for the new hosts.
Have you confirmed that the user that Ambari runs as can communicate with new nodes via passwordless SSH from ambari-server to the new nodes?
Have you verified that the Ambari agents on the new nodes match the ambari-agent version on the existing nodes?
If using SSL (for HDFS, YARN, etc.), have you installed certificates on the new nodes?
If you are adding nodes that were pulled from another cluster, make sure you clear all the previous installs; reformatting the nodes is a good option.
Post node addition checklist:
When adding the new hosts to the cluster (see “Adding Hosts to a Cluster” in the Ambari User Guide), have you set the rack properly for the new nodes?
Do you have a strategy in place for rebalancing the cluster that accounts for cluster load? Consider rebalancing during quiet periods, or throttling the rebalance to a low bandwidth if the cluster load will be high while it runs.
Have you run smoke tests that start an Application Master on one of the new nodes?
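A few of the preparation checks above can be scripted. The sketch below assumes passwordless SSH to the hosts and the CentOS 7 location of the THP setting (/sys/kernel/mm/transparent_hugepage/enabled). The function names and host names are hypothetical:

```shell
#!/bin/sh
# Extract the active setting from the contents of the THP "enabled" file,
# e.g. "always madvise [never]" -> "never".
thp_active_setting() {
  printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Print one line per host with its Java version, THP setting and SELinux mode,
# so a new node can be compared side by side with an existing one.
compare_hosts() {
  for host in "$@"; do
    java_ver=$(ssh "$host" 'java -version 2>&1 | head -n 1')
    thp_raw=$(ssh "$host" 'cat /sys/kernel/mm/transparent_hugepage/enabled')
    selinux=$(ssh "$host" 'getenforce')
    printf '%s | %s | THP=%s | SELinux=%s\n' \
      "$host" "$java_ver" "$(thp_active_setting "$thp_raw")" "$selinux"
  done
}
```

Running `compare_hosts existing-node.example.com new-node.example.com` prints one line per host; any difference between the two lines is something to remediate before running the Add Host wizard.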
04-03-2017
07:10 PM
5 Kudos
1.0. Overview

Monitoring enterprise applications and platforms is a key responsibility of operations teams. The Hortonworks HDP platform uses Ambari for monitoring and alerting, but in an enterprise environment it is unrealistic for operations teams to watch each instance’s monitoring interface in its own front-end. For instance, some operations teams have to monitor tens or hundreds of HDP clusters as well as other systems, including infrastructure-related systems, third-party applications and databases. Monitoring each Ambari instance’s alerting GUI would be tedious. Therefore, a central dashboard is often used to monitor a broad set of clusters, platforms and applications.

SNMP (Simple Network Management Protocol) provides a mechanism to achieve this goal of monitoring a broad set of applications, platforms and infrastructure, and Ambari supports SNMP alerting. This article provides a brief overview of SNMP and a guide for how to integrate HDP platform monitoring with OpenNMS, an open source monitoring tool that supports SNMP. Although the article focuses on OpenNMS as the monitoring management system, it should provide enough detail to serve as a guide for any other monitoring tool that can consume SNMP traps.

1.1. Assumptions

This article assumes you are using CentOS 6 or 7 and already have HDP and Ambari 2.4 or lower running.

2. SNMP Intro

SNMP is a protocol specifically for monitoring. This article will not go too deep into SNMP, but if you want to know more you can always Google SNMP or check out the Wikipedia page here. Here are some SNMP-related definitions you’ll need to know for the purposes of using Ambari and OpenNMS:
OID - the object identifier. Because SNMP can be used to monitor all types of devices, it makes sense to have an identifier that helps segregate monitoring traffic. Ambari’s OID is 1.3.6.1.4.1.18060.16.1.1. The OID is hierarchical in nature and, by appending additional numbers/periods, “var bindings” are defined.
MIB - the “management information base” contains the definitions for the various events for a device. A device, like a network switch, or a service, like Ambari, may have a MIB definition file. You can think of the MIB as a contract or interface definition that allows consumers of SNMP messages to have more context about a device or service’s events. Ambari’s MIB file, accessible here, contains the SNMP definitions that allow consumers of SNMP traps (events from an SNMP-supported device or protocol) to interpret Ambari’s alerts.
Variable Bindings (“var bindings”) - a mechanism for associating values with the variables defined in the MIB, keyed by the child OIDs.
SNMP Trap - an alert message sent from a remote SNMP-enabled device (e.g. Ambari) to a central collector, the “SNMP Monitoring Tool.”

3. Notes on SNMP & Ambari 2.4 (and below)

SNMP is just a protocol, so it is important to understand the different ways it can be used when monitoring an HDP cluster. Let’s look at three different approaches. The last approach (Approach #3) is the one used in the rest of this document, and it is the best approach when you’re using Ambari 2.4 or below. If you’re using Ambari 2.5 or higher, Approach #1 is the best. The next sections explain why.

3.1. Approach 1: Ambari -> SNMP Monitoring Tool

This approach is documented in the standard HDP Ambari documentation (here for HDP 2.4). It is the easiest to set up; however, it is not recommended when using Ambari 2.4 or lower. The reason is that the SNMP structure that Ambari sends is not robust: only two “var bindings” are leveraged in this native setup. The script approach (Approach #3 below) is able to leverage additional values so that the full set of parameters for an alert (what you see in Ambari’s UI) is provided as the SNMP trap payload.

3.2. Approach 2: Ambari -> local SNMP Daemon -> SNMP Monitoring Tool

SNMP daemons (snmpd) can be run on host machines across the cluster, and each host’s snmpd config can be set up to publish alerts from Ambari. The snmpd daemon is an SNMP agent which binds to a port and awaits requests from the SNMP monitoring tool. OpenNMS, for instance, can scan a range of IP addresses and discover the SNMP services that are running. Having your monitoring tool communicate with the remote hosts’ SNMP daemons is beneficial because there are other machine-level metrics and services that may be monitored, beyond what Ambari provides out-of-the-box with its set of alerts. When using Ambari 2.4 or lower, you’ll still want to go with Approach #3 to harness the more robust SNMP trap payload.

3.3. Approach 3: Ambari -> SNMP Script -> SNMP Monitoring Tool

This approach is preferred because the SNMP script puts the SNMP payload together properly so that it can be leveraged within the monitoring tool. There are a few steps to follow on the host where Ambari is running. The example script in the Ambari GitHub repo can be used as an example for sending SNMP traps to a local snmptrapd. However, for this article, we’ll make some small changes so that the final SNMP traps get sent to the host running OpenNMS. Details are below.

4. Preparation

4.1. Ambari and HDP

The remainder of this article assumes you are using Ambari 2.4 and HDP 2.5. If you are using a lower version of Ambari, it is best to check the documentation. For Ambari 2.4 the documentation for setting up SNMP notifications is here: https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.0/bk_ambari-user-guide/content/configuring_notifications.html

4.2. OpenNMS

There are several ways to install OpenNMS. The method we used was the yum approach. Details are here: https://wiki.opennms.org/wiki/Installation:Yum

5. Configuring Ambari

5.1. Setting up SNMP Alerts in Ambari

These instructions should be followed on the host running Ambari. The instructions below were adapted from this article: https://github.com/apache/ambari/blob/trunk/contrib/alert-snmp-mib/README.md

5.1.1. Install SNMP processes

Run this command:

yum install net-snmp net-snmp-utils net-snmp-libs -y

5.1.2. Download the Ambari MIB

Copy the APACHE-AMBARI-MIB.txt file to /usr/share/snmp/mibs. Here is the source location for the MIB file (you will need to download the file): https://issues.apache.org/jira/secure/attachment/12761892/APACHE-AMBARI-MIB.txt

5.1.3. Configure Ambari to leverage the script

Run this command:
echo "org.apache.ambari.contrib.snmp.script=/tmp/snmp_mib_script.sh" >> /etc/ambari-server/conf/ambari.properties

(Note: the command wraps in this document but should be entered as a single command.)

5.1.4. Set Up the Ambari Notification

Run this command:

curl -u admin:$PASSWORD -H 'X-Requested-By: ambari' http://$AMBARI_HOST:8080/api/v1/alert_targets \
-d '{
"href" : "http://$AMBARI_HOST:8080/api/v1/alert_targets",
"items" : [
{
"href" : "http://$AMBARI_HOST:8080/api/v1/alert_targets/1",
"AlertTarget" : {
"description" : "SNMP MIB Target",
"id" : 1,
"name" : "SNMP_MIB",
"notification_type" : "ALERT_SCRIPT"
}
}
]
}'

Note: Make sure to set these environment variables ahead of time:
AMBARI_HOST
PASSWORD

This POST to Ambari’s REST API will set up an alert notification in Ambari. To review the outcome of the command, in Ambari, navigate to Alerts -> Manage Notifications, where you should see the new SNMP_MIB notification listed.

5.1.5. Set up the script

Create the script /tmp/snmp_mib_script.sh using your favorite editor (e.g. “vi /tmp/snmp_mib_script.sh”) and use the contents below (note: put in the IP address of the host where you will be running OpenNMS where you see “<OpenNMS Host IP Address>”):

#!/bin/bash
HOST=<OpenNMS Host IP Address>
COMMUNITY=public
STATE=0
# Map Ambari's alert state (4th script argument) to the integer sent as alertState
if [ "$4" = "OK" ]; then
STATE=0
elif [ "$4" = "UNKNOWN" ]; then
STATE=1
elif [ "$4" = "WARNING" ]; then
STATE=2
elif [ "$4" = "CRITICAL" ]; then
STATE=3
fi
# Log the command to /tmp/traps.log so you can confirm that Ambari invoked the script
echo /usr/bin/snmptrap -v 2c \
-c "$COMMUNITY" "$HOST" '' APACHE-AMBARI-MIB::apacheAmbariAlert \
alertDefinitionId i 0 \
alertDefinitionName s "$1" \
alertDefinitionHash s "n/a" \
alertName s "$2" \
alertText s "$5" \
alertState i $STATE \
alertHost s "$(hostname)" \
alertService s "$3" \
alertComponent s Ambari >> /tmp/traps.log
# Send the trap to the OpenNMS host
/usr/bin/snmptrap -v 2c \
-c "$COMMUNITY" "$HOST" '' APACHE-AMBARI-MIB::apacheAmbariAlert \
alertDefinitionId i 0 \
alertDefinitionName s "$1" \
alertDefinitionHash s "n/a" \
alertName s "$2" \
alertText s "$5" \
alertState i $STATE \
alertHost s "$(hostname)" \
alertService s "$3" \
alertComponent s Ambari

Run this command to make the file executable:

chmod +x /tmp/snmp_mib_script.sh

5.2. Testing the Alerts

You’ll notice that the script above, in addition to using snmptrap to send the trap (SNMP event) to OpenNMS, logs the command it used to this file: /tmp/traps.log. This is handy for confirming that Ambari’s alert is actually invoking the script. If you run the test below and do not see results in the file, go back and check that the previous steps were completed.

5.2.1. Here’s how to test

Find a component or service that you are OK disabling or restarting. Be careful if you are doing this in a production environment. For my test, I restarted one of the ZooKeeper instances, and in the /tmp/traps.log file I saw this line printed:

/usr/bin/snmptrap -v 2c -c public 172.26.94.48 APACHE-AMBARI-MIB::apacheAmbariAlert alertDefinitionId i 0 alertDefinitionName s zookeeper_server_process alertDefinitionHash s n/a alertName s ZooKeeper Server Process alertText s Connection failed: [Errno 111] Connection refused to node3.openstacklocal:2181 alertState i 3 alertHost s node1.openstacklocal alertService s ZOOKEEPER alertComponent s Ambari

What this test verifies is that Ambari invokes the script when an alert is triggered. Because the commands are logged, you can reuse them later to test OpenNMS’s connectivity without needing to repeatedly restart components in your cluster. Note that echo strips the quoting from the logged line, so before re-running it you need to restore the quotes around multi-word values and the empty '' argument that follows the host.

6. Configuring OpenNMS

6.1. Importing the Ambari MIB

See section 5.1.2 for instructions on how to download the MIB file, APACHE-AMBARI-MIB.txt. Once you have it downloaded to your local computer, you can install it into OpenNMS. The steps are:
1. Navigate to the Admin -> Configure OpenNMS page
2. Find and click on the link for “SNMP MIB Compiler”
3. Upload the APACHE-AMBARI-MIB.txt file
4. Now the MIB file will appear under the “pending” section. Right-click on the MIB file and select the “Compile” option
5. Now the MIB file will appear under the “compiled” section. Right-click on the MIB file and select “Generate Events”
6. Right-click on the MIB file and select “Generate Data Collection”

6.2. Customizing Events

In the previous section, a set of events was generated from the MIB file. However, this is not sufficient for mapping the Ambari severity to OpenNMS’s notion of severity. Therefore, you have to manually create the mappings using var binding rules for all of the Ambari alert severities, which are:
CRITICAL
WARNING
UNKNOWN
OK

There is a shortcut, however, to doing this manually! You can use the events file provided in Section 6.2.2. Section 6.2.1 explains the process from scratch.

6.2.1. Manual set up through the OpenNMS UI

The following illustrates the steps required to change the single mapping and get the required four to match what Ambari will send:
1. Navigate to the Admin -> Configure OpenNMS page
2. Click on “Customize Event Configurations”
3. Find and choose the Ambari MIB
4. Initially you will see a single event. Select it and the bottom section will populate.
5. Now scroll down and click edit. The bottom section will become editable.
6. You now will change the name of the event and set it up for CRITICAL severity alerts. Go to the “Event UEI” field and change the last part of the alert event to “apacheAmbariAlert-CRITICAL”. The full Event UEI will be “uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-CRITICAL”.
7. Now you need to map the event to a specific value within the trap. This is referred to as the “var binding”. For the CRITICAL alerts, use Varbind Number 6 and Varbind Value 3.
8. In the same screen, choose the Critical severity from the drop-down.
9. Click save.
10. At this point you have set up a mapping for Ambari’s CRITICAL event to a distinct event in OpenNMS.
11. Repeat steps 5-10 for the OK, WARNING and UNKNOWN severities using the noted differences below

After your first pass of steps 5-10, the good news is that you now have Ambari’s CRITICAL severity (var bind 6, value of 3) mapped to an OpenNMS definition. The bad news is that there is no easy way in the OpenNMS UI to clone an event definition. This means you will have to add a new event. Before you click the “Add Event” button, note down the other values that you used for the existing event definition. Then click the “Add Event” button and fill in the copied details.

The new event will need to have three differences from the original event that you are copying, specifically for these fields:
Event UEI
Severity
Mask Varbinds

Here are the values you’ll need for the various severities:
OK
Event UEI: uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-OK
Severity: Normal
Mask Varbinds - Varbind Number: 6, Varbind Value: 0

WARNING
Event UEI: uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-WARNING
Severity: Warning
Mask Varbinds - Varbind Number: 6, Varbind Value: 2

UNKNOWN
Event UEI: uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-UNKNOWN
Severity: Indeterminate
Mask Varbinds - Varbind Number: 6, Varbind Value: 1

6.2.2. Shortcut: Events file example

The path to the events file in your environment may differ from the one shown here, depending on how you installed OpenNMS. The default path to the events file is:
/opt/opennms/etc/events/APACHE-AMBARI-MIB.events.xml

You will need to remove the existing contents and replace them with the following contents:
<events xmlns="http://xmlns.opennms.org/xsd/eventconf">
<event>
<mask>
<maskelement>
<mename>id</mename>
<mevalue>.1.3.6.1.4.1.18060.16</mevalue>
</maskelement>
<maskelement>
<mename>generic</mename>
<mevalue>6</mevalue>
</maskelement>
<maskelement>
<mename>specific</mename>
<mevalue>1</mevalue>
</maskelement>
<varbind>
<vbnumber>6</vbnumber>
<vbvalue>3</vbvalue>
</varbind>
</mask>
<uei>uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-CRITICAL</uei>
<event-label>Ambari: CRITICAL</event-label>
<descr><p>The SNMP trap that is generated as a result of an alert.</p><table>
<tr><td><b>
alertDefinitionName</b></td><td>
%parm[#1]%;</td><td><p></p></td></tr>
<tr><td><b>
alertDefinitionHash</b></td><td>
%parm[#2]%;</td><td><p></p></td></tr>
<tr><td><b>
alertName</b></td><td>
%parm[#3]%;</td><td><p></p></td></tr>
<tr><td><b>
alertText</b></td><td>
%parm[#4]%;</td><td><p></p></td></tr>
<tr><td><b>
alertState</b></td><td>
%parm[#5]%;</td><td><p>
ok(0)
unknown(1)
warning(2)
critical(3)
</p></td></tr>
<tr><td><b>
alertHost</b></td><td>
%parm[#6]%;</td><td><p></p></td></tr>
<tr><td><b>
alertService</b></td><td>
%parm[#7]%;</td><td><p></p></td></tr>
<tr><td><b>
alertComponent</b></td><td>
%parm[#8]%;</td><td><p></p></td></tr></table></descr>
<logmsg dest="logndisplay"><p>
apacheAmbariAlert trap received
alertDefinitionName=%parm[#1]%
alertDefinitionHash=%parm[#2]%
alertName=%parm[#3]%
alertText=%parm[#4]%
alertState=%parm[#5]%
alertHost=%parm[#6]%
alertService=%parm[#7]%
alertComponent=%parm[#8]%</p></logmsg>
<severity>Critical</severity>
<varbindsdecode>
<parmid>parm[#6]</parmid>
<decode varbindvalue="0" varbinddecodedstring="ok"/>
<decode varbindvalue="1" varbinddecodedstring="unknown"/>
<decode varbindvalue="2" varbinddecodedstring="warning"/>
<decode varbindvalue="3" varbinddecodedstring="critical"/>
</varbindsdecode>
</event>
<event>
<mask>
<maskelement>
<mename>id</mename>
<mevalue>.1.3.6.1.4.1.18060.16</mevalue>
</maskelement>
<maskelement>
<mename>generic</mename>
<mevalue>6</mevalue>
</maskelement>
<maskelement>
<mename>specific</mename>
<mevalue>1</mevalue>
</maskelement>
<varbind>
<vbnumber>6</vbnumber>
<vbvalue>2</vbvalue>
</varbind>
</mask>
<uei>uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-WARNING</uei>
<event-label>Ambari: WARNING</event-label>
<descr><p>The SNMP trap that is generated as a result of an alert.</p><table>
<tr><td><b>
alertDefinitionName</b></td><td>
%parm[#1]%;</td><td><p></p></td></tr>
<tr><td><b>
alertDefinitionHash</b></td><td>
%parm[#2]%;</td><td><p></p></td></tr>
<tr><td><b>
alertName</b></td><td>
%parm[#3]%;</td><td><p></p></td></tr>
<tr><td><b>
alertText</b></td><td>
%parm[#4]%;</td><td><p></p></td></tr>
<tr><td><b>
alertState</b></td><td>
%parm[#5]%;</td><td><p>
ok(0)
unknown(1)
warning(2)
critical(3)
</p></td></tr>
<tr><td><b>
alertHost</b></td><td>
%parm[#6]%;</td><td><p></p></td></tr>
<tr><td><b>
alertService</b></td><td>
%parm[#7]%;</td><td><p></p></td></tr>
<tr><td><b>
alertComponent</b></td><td>
%parm[#8]%;</td><td><p></p></td></tr></table></descr>
<logmsg dest="logndisplay"><p>
apacheAmbariAlert trap received
alertDefinitionName=%parm[#1]%
alertDefinitionHash=%parm[#2]%
alertName=%parm[#3]%
alertText=%parm[#4]%
alertState=%parm[#5]%
alertHost=%parm[#6]%
alertService=%parm[#7]%
alertComponent=%parm[#8]%</p></logmsg>
<severity>Warning</severity>
<varbindsdecode>
<parmid>parm[#6]</parmid>
<decode varbindvalue="0" varbinddecodedstring="ok"/>
<decode varbindvalue="1" varbinddecodedstring="unknown"/>
<decode varbindvalue="2" varbinddecodedstring="warning"/>
<decode varbindvalue="3" varbinddecodedstring="critical"/>
</varbindsdecode>
</event>
<event>
<mask>
<maskelement>
<mename>id</mename>
<mevalue>.1.3.6.1.4.1.18060.16</mevalue>
</maskelement>
<maskelement>
<mename>generic</mename>
<mevalue>6</mevalue>
</maskelement>
<maskelement>
<mename>specific</mename>
<mevalue>1</mevalue>
</maskelement>
<varbind>
<vbnumber>6</vbnumber>
<vbvalue>1</vbvalue>
</varbind>
</mask>
<uei>uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-UNKNOWN</uei>
<event-label>Ambari: UNKNOWN</event-label>
<descr><p>The SNMP trap that is generated as a result of an alert.</p><table>
<tr><td><b>
alertDefinitionName</b></td><td>
%parm[#1]%;</td><td><p></p></td></tr>
<tr><td><b>
alertDefinitionHash</b></td><td>
%parm[#2]%;</td><td><p></p></td></tr>
<tr><td><b>
alertName</b></td><td>
%parm[#3]%;</td><td><p></p></td></tr>
<tr><td><b>
alertText</b></td><td>
%parm[#4]%;</td><td><p></p></td></tr>
<tr><td><b>
alertState</b></td><td>
%parm[#5]%;</td><td><p>
ok(0)
unknown(1)
warning(2)
critical(3)
</p></td></tr>
<tr><td><b>
alertHost</b></td><td>
%parm[#6]%;</td><td><p></p></td></tr>
<tr><td><b>
alertService</b></td><td>
%parm[#7]%;</td><td><p></p></td></tr>
<tr><td><b>
alertComponent</b></td><td>
%parm[#8]%;</td><td><p></p></td></tr></table></descr>
<logmsg dest="logndisplay">New Event Log Message</logmsg>
<severity>Indeterminate</severity>
<varbindsdecode>
<parmid>parm[#6]</parmid>
<decode varbindvalue="0" varbinddecodedstring="ok"/>
<decode varbindvalue="1" varbinddecodedstring="unknown"/>
<decode varbindvalue="2" varbinddecodedstring="warning"/>
<decode varbindvalue="3" varbinddecodedstring="critical"/>
</varbindsdecode>
</event>
<event>
<mask>
<maskelement>
<mename>id</mename>
<mevalue>.1.3.6.1.4.1.18060.16</mevalue>
</maskelement>
<maskelement>
<mename>generic</mename>
<mevalue>6</mevalue>
</maskelement>
<maskelement>
<mename>specific</mename>
<mevalue>1</mevalue>
</maskelement>
<varbind>
<vbnumber>6</vbnumber>
<vbvalue>0</vbvalue>
</varbind>
</mask>
<uei>uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-OK</uei>
<event-label>Ambari: OK</event-label>
<descr><p>
apacheAmbariAlert trap received
alertDefinitionName=%parm[#1]%
alertDefinitionHash=%parm[#2]%
alertName=%parm[#3]%
alertText=%parm[#4]%
alertState=%parm[#5]%
alertHost=%parm[#6]%
alertService=%parm[#7]%
alertComponent=%parm[#8]%</p></descr>
<logmsg dest="logndisplay"><p>
apacheAmbariAlert trap received
alertDefinitionName=%parm[#1]%
alertDefinitionHash=%parm[#2]%
alertName=%parm[#3]%
alertText=%parm[#4]%
alertState=%parm[#5]%
alertHost=%parm[#6]%
alertService=%parm[#7]%
alertComponent=%parm[#8]%</p></logmsg>
<severity>Normal</severity>
<varbindsdecode>
<parmid>parm[#6]</parmid>
<decode varbindvalue="0" varbinddecodedstring="ok"/>
<decode varbindvalue="1" varbinddecodedstring="unknown"/>
<decode varbindvalue="2" varbinddecodedstring="warning"/>
<decode varbindvalue="3" varbinddecodedstring="critical"/>
</varbindsdecode>
</event>
</events>
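Once the events file is in place, each of the four mappings can be exercised by sending a hand-built trap per severity. The sketch below is modeled on the snmptrap invocation used by the script in section 5.1.5 and assumes net-snmp and the Ambari MIB are installed on the sending host. The function names, the DRY_RUN switch and the test alert values are hypothetical:

```shell
#!/bin/sh
# OpenNMS severity that the events file assigns to each alertState value (varbind 6).
opennms_severity_for_state() {
  case "$1" in
    0) echo "Normal" ;;
    1) echo "Indeterminate" ;;
    2) echo "Warning" ;;
    3) echo "Critical" ;;
    *) echo "unmapped"; return 1 ;;
  esac
}

# Send (or, with DRY_RUN=1, only print) a test trap for one alertState value.
# $1 = OpenNMS host, $2 = alertState (0-3)
send_test_trap() {
  run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi
  $run /usr/bin/snmptrap -v 2c -c public "$1" '' APACHE-AMBARI-MIB::apacheAmbariAlert \
    alertDefinitionId i 0 \
    alertDefinitionName s "test_definition" \
    alertDefinitionHash s "n/a" \
    alertName s "Test Alert" \
    alertText s "Manual test trap" \
    alertState i "$2" \
    alertHost s "$(hostname)" \
    alertService s "TEST" \
    alertComponent s Ambari
}
```

A loop such as `for s in 0 1 2 3; do send_test_trap "$OPENNMS_HOST" "$s"; done` should then produce one event per severity in OpenNMS, and opennms_severity_for_state tells you which severity to expect for each value.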
6.3. Creating Notifications

If a team will be monitoring OpenNMS, you will want a mechanism for them to acknowledge alerts so that there is no duplication of effort and so that there is some audit capability and accountability. Thus you need to map the various Ambari event severities to notifications. Without this step, OpenNMS will still recognize the events, but it will not make them actionable. Section 6.3.1 explains how to configure the notifications manually in the OpenNMS UI, and Section 6.3.2 provides a shortcut that you may use instead.

6.3.1. Manual set up through the OpenNMS UI

The following illustrates the steps required to set up the notification mappings for the various Ambari severity events:
1. Navigate to the Admin -> Configure OpenNMS page
2. Click on “Configure Notifications”
3. Click “Add New Event Notification”
4. Find the “Ambari: CRITICAL” event, select it and click “Next”
5. Choose “Skip results validation”
6. Provide a name, description and text message
7. Click Finish
8. Repeat steps 3-7 for the other Ambari alert severities (OK, WARNING and UNKNOWN)

Once you are done setting up all the notifications, you should see all four of them listed in the main “Edit Notifications” screen.

6.3.2. Shortcut: Notifications snippet

The path to the notifications file in your environment may differ from the one shown here, depending on how you installed OpenNMS. The default path to the notifications file is:
/opt/opennms/etc/notifications.xml

You will need to keep the existing contents and add the following contents (at the end of the file but before the “</notifications>” line). Also, replace “CLUKASIK CLUSTER01” with a name that suits your needs.
<notification name="CLUKASIK CLUSTER01 CRITICAL ALERT" status="on" writeable="yes">
<uei>uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-CRITICAL</uei>
<description>CLUSTER01 ALERT %parm[#2]% DETECTED (Sev: %parm[#5]%; Node: %interface%)</description>
<rule>(IPADDR IPLIKE *.*.*.*)</rule>
<destinationPath>Email-Admin</destinationPath>
<text-message>CLUSTER01 ALERT
Alert Definition Name: %parm[#2]%
Alert Name: %parm[#4]%
Alert Text: %parm[#5]%
Alert State: %parm[#6]%
Alert Host: %parm[#7]%
Alert Service: %parm[#8]%
Alert Component: %parm[#9]%
%severity%</text-message>
<subject>Notice #%noticeid%</subject>
<numeric-message>111-%noticeid%</numeric-message>
</notification>
<notification name="CLUKASIK CLUSTER01 OK ALERT" status="on" writeable="yes">
<uei>uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-OK</uei>
<description>CLUSTER01 ALERT %parm[#2]% DETECTED (Sev: %parm[#5]%; Node: %interface%)</description>
<rule>(IPADDR IPLIKE *.*.*.*)</rule>
<destinationPath>Email-Admin</destinationPath>
<text-message>CLUSTER01 ALERT
Alert Definition Name: %parm[#2]%
Alert Name: %parm[#4]%
Alert Text: %parm[#5]%
Alert State: %parm[#6]%
Alert Host: %parm[#7]%
Alert Service: %parm[#8]%
Alert Component: %parm[#9]%
%severity%</text-message>
<subject>Notice #%noticeid%</subject>
<numeric-message>111-%noticeid%</numeric-message>
</notification>
<notification name="CLUKASIK CLUSTER01 UNKNOWN ALERT" status="on" writeable="yes">
<uei>uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-UNKNOWN</uei>
<description>CLUSTER01 ALERT %parm[#2]% DETECTED (Sev: %parm[#5]%; Node: %interface%)</description>
<rule>(IPADDR IPLIKE *.*.*.*)</rule>
<destinationPath>Email-Admin</destinationPath>
<text-message>CLUSTER01 ALERT
Alert Definition Name: %parm[#2]%
Alert Name: %parm[#4]%
Alert Text: %parm[#5]%
Alert State: %parm[#6]%
Alert Host: %parm[#7]%
Alert Service: %parm[#8]%
Alert Component: %parm[#9]%
%severity%</text-message>
<subject>Notice #%noticeid%</subject>
<numeric-message>111-%noticeid%</numeric-message>
</notification>
<notification name="CLUKASIK CLUSTER01 WARNING ALERT" status="on" writeable="yes">
<uei>uei.opennms.org/traps/APACHE-AMBARI-MIB/apacheAmbariAlert-WARNING</uei>
<description>CLUSTER01 ALERT %parm[#2]% DETECTED (Sev: %parm[#5]%; Node: %interface%)</description>
<rule>(IPADDR IPLIKE *.*.*.*)</rule>
<destinationPath>Email-Admin</destinationPath>
<text-message>CLUSTER01 ALERT
Alert Definition Name: %parm[#2]%
Alert Name: %parm[#4]%
Alert Text: %parm[#5]%
Alert State: %parm[#6]%
Alert Host: %parm[#7]%
Alert Service: %parm[#8]%
Alert Component: %parm[#9]%
%severity%</text-message>
<subject>Notice #%noticeid%</subject>
<numeric-message>111-%noticeid%</numeric-message>
</notification>

7. Testing the Alerts

To test the alerts, we will be turning off some HDP services. Therefore, if you are doing this in a production environment, you should plan the test for an appropriate time. For our test we will disable one of three ZooKeeper instances in the HDP cluster. In Ambari, stop a ZooKeeper instance and click OK when it asks “Are you sure?”

Now watch your OpenNMS dashboard. It may take a minute or two, but eventually you will see some “outstanding notices” under the Notifications section. Click on the link for the “outstanding notices” to list them, then click on one of the ID values to open the notification detail.

8. Summary

This document scratches the surface of setting up an SNMP flow from Ambari to OpenNMS. OpenNMS offers more features for making the alerts manageable for an operations team, from customizable dashboards to a mechanism for managing on-call schedules.
01-05-2017
02:43 PM
1 Kudo
Instead of using the "--incremental" approach, you could use "--query" or "--where" and bake in your own incremental logic.
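A minimal sketch of custom incremental logic built on Sqoop's "--where" option. The JDBC URL, user, table, column and target path are all hypothetical, and how you persist the last imported value between runs is left to you. The import_since helper and its DRY_RUN switch are illustrative only:

```shell
#!/bin/sh
# Import only the rows changed after the given watermark.
# $1 = last value already imported (e.g. read from a state file you maintain)
# With DRY_RUN=1 the command is printed instead of executed.
import_since() {
  run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi
  $run sqoop import \
    --connect "jdbc:mysql://db.example.com/sales" \
    --username etl_user \
    --table orders \
    --where "updated_at > '$1'" \
    --target-dir "/data/orders/incr_$(date +%Y%m%d%H%M%S)" \
    --num-mappers 4
}
```

For example, `import_since '2017-01-01 00:00:00'` pulls only rows updated after that timestamp into a fresh target directory. If you use "--query" instead, remember that Sqoop requires the literal $CONDITIONS token in the query's WHERE clause, plus either "--split-by" or a single mapper.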
01-04-2017
05:13 PM
1 Kudo
Would this work?
select partition_dt, count(*) from db.tablename group by partition_dt
01-03-2017
09:23 PM
1 Kudo
To query across multiple partitions, you should not need to do anything special, other than making sure your where clause is not forcing you into a specific folder. E.g. do not filter on a single partition_dt value in your query. One comment: avoid creating too many partitions. Partitioning that is too granular (unless you are pruning old data) will cause performance problems. I recommend taking a look at this article for some best practices on Hive partitioning.
12-20-2016
01:17 PM
2 Kudos
One key benefit is that, with Kerberos, passwords or keys are sent across the network as infrequently as possible. With SSH, either passwords are being transmitted or you are persisting files with secret keys, both of which raise security concerns. This article does a great job comparing and contrasting SSH and Kerberos: http://docstore.mik.ua/orelly/networking_2ndEd/ssh/ch11_04.htm "When a user identifies herself to the Kerberos system, the identifying program (kinit) uses her password for an exchange with the KDC, then immediately erases it, never having sent it over the network in any form nor stored it on disk."
12-02-2016
07:19 PM
2 Kudos
No, they can run on different machines.
11-30-2016
01:50 PM
@Garima Dosi - I see this WARN in the logs:
2016-11-28 16:25:51,029 WARN [Thread-2037242]: split.JobSplitWriter (JobSplitWriter.java:writeOldSplits(168)) - Max block location exceeded for split:
I am not sure if it would lead to the current outcome, but it might be a clue.
11-29-2016
05:20 PM
Are you sure a user (or process) is not issuing "yarn application -kill <app id>"? From the logs, that is what it looks like. What is running on 172.23.35.6 that would have killed the application?
2016-11-28 16:26:03,683 INFO [IPC Server handler 0 on 37719] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill job job_1479381018014_95355 received from bdsa_ingest (auth:TOKEN) at 172.23.35.6
2016-11-28 16:26:03,866 INFO [Thread-107] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Setting job diagnostics to Kill job job_1479381018014_95355 received from bdsa_ingest (auth:TOKEN) at 172.23.35.6
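To find out who issued kill requests, the relevant fields can be pulled out of the application master log. This is a small sketch that assumes the log lines follow the MRClientService format shown above. The kill_requests function name is hypothetical:

```shell
#!/bin/sh
# Read an application master log on stdin and print
# "<job id> <user> <source ip>" for every kill request it contains.
kill_requests() {
  sed -n 's/.*MRClientService: Kill job \(job_[0-9_]*\) received from \([^ ]*\) .* at \([0-9.]*\).*/\1 \2 \3/p'
}
```

For example, `yarn logs -applicationId application_1479381018014_95355 | kill_requests` would print `job_1479381018014_95355 bdsa_ingest 172.23.35.6` for the log line quoted above, which gives you the user and the host to investigate.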