12-10-2019
08:01 AM
Sometimes, a node needs to be decommissioned or has downtime of undetermined length for repairs. If the node hosts a ResourceManager, move the ResourceManager to a new host using the Move ResourceManager wizard from the Ambari Web User Interface. The wizard describes the set of automated steps taken to move a ResourceManager to a new host. Since YARN and MapReduce2 will be restarted, a cluster maintenance window must be planned and cluster downtime expected.
This video describes how to move a Resource Manager to a new host using the Resource Manager Move Wizard from Ambari Web User Interface.
Open the video on YouTube here
To move YARN Resource Manager to a new host with Ambari, do the following:
In Ambari Web, browse to Services > YARN > Summary.
Select Service Actions and choose Move ResourceManager. The Move ResourceManager wizard launches, describing a set of automated steps that must be followed to move one ResourceManager to a new host.
Click Get Started. This wizard will provide a walk-through to move the ResourceManager.
The following services will be restarted as part of the wizard:
YARN
MAPREDUCE2
You should plan a cluster maintenance window and prepare for cluster downtime when moving the ResourceManager.
Click Next.
Select the target host to assign the ResourceManager to, then click Next.
Review and confirm the host selections.
Expand YARN if necessary, to review all the configuration changes proposed for YARN.
Click Deploy to approve the changes and automatically start moving the ResourceManager to the new host.
On Configure Components, click Complete when all the progress bars are completed.
After Ambari Web reloads, there will be some alerts. Wait a few minutes until all the services restart.
Restart any components using Ambari Web, if necessary.
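Once the move has finished, a quick command-line check can confirm that the NodeManagers have re-registered with the ResourceManager on its new host. This is an optional sanity check, not part of the wizard; run it from any host with a YARN client configuration:
# Every worker node should be listed and report a RUNNING state
yarn node -list -all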
REFERENCE:
http://docs.hortonworks.com (official product documentation)
http://community.hortonworks.com (community forum)
12-10-2019
08:00 AM
To ensure that another ResourceManager is available if the active ResourceManager in a cluster fails, ResourceManager high availability (HA) should be enabled and configured.
In an HDP 2.2 or later environment, high availability (HA) can be configured for the ResourceManager by using the Enable ResourceManager HA wizard. To use the wizard, there must be at least three hosts in the cluster and Apache ZooKeeper servers must be running.
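After the wizard completes, the HA state of the two ResourceManagers can be checked from the command line. A minimal check, assuming the Ambari-default ResourceManager IDs rm1 and rm2 (adjust if yarn.resourcemanager.ha.rm-ids lists different IDs):
# One ResourceManager should report active and the other standby
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2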
The Enable ResourceManager high availability section from the documentation contains the steps mentioned in this video.
Open the video on YouTube here
Recommended links:
Product documentation page
Community Forum
12-10-2019
07:56 AM
To ensure that another NameNode is always available when the active NameNode host fails, NameNode high availability (HA) should be enabled and configured on the cluster from the Ambari Web User Interface.
This video explains how to launch the Enable NameNode HA wizard and the steps that must be followed to set up NameNode high availability.
Open the YouTube video here
As a prerequisite, ensure the following:
HDFS and ZooKeeper must not be in Maintenance Mode. Enabling NameNode HA requires stopping and starting those services, and Maintenance Mode prevents the start and stop operations from occurring, so the NameNode HA wizard will not complete successfully.
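Once the wizard finishes, the state of both NameNodes can be verified from the command line. A minimal check, assuming the Ambari-default NameNode IDs nn1 and nn2 (adjust if dfs.ha.namenodes.<nameservice> lists different IDs):
# One NameNode should report active and the other standby
su -l hdfs -c "hdfs haadmin -getServiceState nn1"
su -l hdfs -c "hdfs haadmin -getServiceState nn2"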
The Enable NameNode high availability section from the documentation contains the steps mentioned in this video.
Recommended links:
Product documentation page
Community Forum
12-10-2019
07:55 AM
Apache Ranger is one of the easiest, most robust, and most flexible frameworks for managing authorization across the different components of a cluster. However, if your policies are not syncing, that can become an issue.
In this video, we will be troubleshooting Ranger policy synchronization. We will look into the components that are involved, how they interact with each other, and a few of the most common issues.
The goal of the video is to understand the main factors to check when a policy synchronization issue occurs and how to solve them.
Open the video on YouTube here
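As a starting point when policies do not appear to sync, compare what the plugin has cached locally with what Ranger Admin is serving, and check the plugin host's component log for refresher errors. A minimal sketch, assuming the HDFS plugin and default cache locations; the cache directory, service name, and log path will differ per component and cluster:
# The policy cache file's timestamp shows when the plugin last downloaded policies
ls -l /etc/ranger/<service_name>/policycache/
# For the HDFS plugin, look for policy refresher errors in the NameNode log
grep -i "PolicyRefresher" /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log | tail -20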
12-10-2019
07:53 AM
The ZooKeeper transaction log and snapshot files are not human readable by default. Running a cat command on these files does not give clear information on their content.
The following video explains how to read ZooKeeper transaction logs and snapshots.
Open the video on YouTube here
To view the content of these files, use the following.
To read the snapshots:
java -cp '/usr/hdp/current/zookeeper-server/zookeeper.jar:/usr/hdp/current/zookeeper-server/lib/*' org.apache.zookeeper.server.SnapshotFormatter <Snapshot file name>
To read the transaction logs:
java -cp '/usr/hdp/current/zookeeper-server/zookeeper.jar:/usr/hdp/current/zookeeper-server/lib/*' org.apache.zookeeper.server.LogFormatter <Log file name>
The classes that need to be used are located under /usr/hdp/current/zookeeper-server and /usr/hdp/current/zookeeper-server/lib.
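As a usage example, the snapshot and transaction log files live in the version-2 subdirectory of the ZooKeeper dataDir. The sketch below assumes the common HDP default dataDir of /hadoop/zookeeper and an example log file name; check the dataDir property in zoo.cfg if the path differs:
# Snapshot files are named snapshot.<zxid>, transaction logs are named log.<zxid>
ls -l /hadoop/zookeeper/version-2/
# Dump one transaction log in human-readable form (file name is an example)
java -cp '/usr/hdp/current/zookeeper-server/zookeeper.jar:/usr/hdp/current/zookeeper-server/lib/*' org.apache.zookeeper.server.LogFormatter /hadoop/zookeeper/version-2/log.100000001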
12-10-2019
03:59 AM
This video article provides the steps on how to use the Reassign Partitions tool (kafka-reassign-partitions.sh):
Open YouTube video here
Create a file named topics-to-move.json with the following content:
{
"topics": [{"topic":"<partitionName>"}],
"version":1
}
Run the following command:
./kafka-reassign-partitions.sh --zookeeper master:2181 --topics-to-move-json-file topics-to-move.json --broker-list "<brokerID>" --generate
Take the Proposed partition reassignment configuration section from the output of the previous command and save it to another JSON file named reassign-partition.json, as illustrated below.
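A hypothetical example of what reassign-partition.json might look like for a single-partition topic being moved to broker 2 (the topic name, partition number, and broker ID are placeholders; use the JSON that --generate actually proposes for your cluster):
{
  "version": 1,
  "partitions": [{"topic": "<topicName>", "partition": 0, "replicas": [2]}]
}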
Run the following command:
./kafka-reassign-partitions.sh --zookeeper master:2181 --reassignment-json-file reassign-partition.json --execute
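Optionally, the same tool can confirm when the reassignment has completed. This verification step is not part of the original list, but it uses the tool's standard --verify option:
# Reports, per partition, whether the reassignment completed or is still in progress
./kafka-reassign-partitions.sh --zookeeper master:2181 --reassignment-json-file reassign-partition.json --verify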
12-10-2019
03:55 AM
At times, a Kafka broker can find one of its log directories at 100% utilization, and the broker process will then fail to start.
This article provides the instructions to manually move partition data between different log directories within a Kafka Broker.
Open the video on YouTube here
The Kafka brokers maintain two offset checkpoint files inside each log directory:
replication-offset-checkpoint
recovery-point-offset-checkpoint
Both of these files have the following format:
(a) 1st line: version number
(b) 2nd line: number of topic-partition entries in the file
(c) All remaining lines: one entry per topic-partition maintained within the current log directory, containing its replication offset / recovery point offset (see the example below).
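For illustration, a hypothetical replication-offset-checkpoint covering two partitions of a made-up topic named events could look like the following: version 0, two entries, then one "<topic> <partition> <offset>" line per partition.
0
2
events 0 120393
events 1 119984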
12-10-2019
03:50 AM
Ambari Infra Solr can be present on HDP and HDF clusters. HDPSearch Solr is a separate product that can be installed on top of an HDP cluster.
This video talks about the goal of each, the differences between them, and how to correctly select the product category when opening a support case.
Open YouTube video here
12-10-2019
03:49 AM
This article describes the steps required to complete the setup for accessing Grafana over HTTPS with CA-signed certificates.
Open YouTube video here
Ambari Metrics System includes Grafana, a daemon that runs on a specific host in the cluster and serves pre-built dashboards for visualizing metrics collected by the Metrics Collector. For this article, the following servers are used:
172.25.33.152 c3132-node1.user.local (Ambari Server)
172.25.36.9 c3132-node2.user.local (Ambari Metrics Collector + Grafana)
172.25.40.27 c3132-node3.user.local (Ambari Metrics Collector)
172.25.33.163 c3132-node4.user.local
By default, Grafana listens on port TCP/3000:
# for i in $(netstat -utnlp | awk '/grafana/ {print substr($7, 1, length($7)-13)}' | sort -u) ; do echo ; ps -eo pid,user,command --cols 128 | grep $i | grep -v grep ; netstat -utnlp | grep $i ; echo ; done
270925 ams /usr/lib/ambari-metrics-grafana/bin/grafana-server --pidfile=/var/run/ambari-metrics-grafana/grafana-server.pid
tcp6 0 0 :::3000 :::* LISTEN 270925/grafana-serv
Here, the running process is grafana-server, the owner is ams, and it is listening on port TCP/3000.
All the configuration for Grafana is handled by Ambari and is reflected in the ams-grafana.ini file located in the /etc/ambari-metrics-grafana/conf/ directory. Grafana needs to be restarted for any configuration change to take effect.
In enterprises where security is required, limit Grafana access to HTTPS connections only. To enable HTTPS for Grafana, update the following properties:
AmbariUI / Services / Ambari Metrics / Configs -> Advanced ams-grafana-ini
protocol: http by default. For this article, change it to https.
ca_cert: The path to the CA root certificate or bundle used to validate the Grafana certificate. Since a PKCS#12 bundle certificate is used here, the CA certificate chain needs to be extracted from it.
cert_file: The path to the certificate. This certificate needs to be in PEM format.
cert_key: The path to the private key that matches the certificate's public key. This needs to be an unencrypted RSA private key.
For this article, the CA will provide us with a certificate bundle located at:
/var/tmp/certificates/GRAFANA
Since the certificate information provided by the CA is a PKCS#12 certificate bundle, complete the following steps:
Extract the root and intermediate certificates, using the following command:
openssl pkcs12 -in c3132-node2.user.local.p12 -out ams-ca.crt -cacerts -nokeys -passin pass:hadoop1234
Extract the server certificate:
openssl pkcs12 -in c3132-node2.user.local.p12 -out ams-grafana.crt -clcerts -nokeys -passin pass:hadoop1234
Extract the private key:
openssl pkcs12 -in c3132-node2.user.local.p12 -nocerts -nodes -out ams-grafana.key -passin pass:hadoop1234
Copy the certificates to a folder the ams user has permissions on. For this article, the default path and the default names are used:
cp ams-*.* /etc/ambari-metrics-grafana/conf/
chown ams:hadoop /etc/ambari-metrics-grafana/conf/ams-*.*
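Optionally (not part of the original steps), verify that the extracted certificate and private key belong together before handing them to Grafana; the two modulus hashes below must be identical:
# The MD5 of the certificate modulus and of the key modulus must match
openssl x509 -noout -modulus -in /etc/ambari-metrics-grafana/conf/ams-grafana.crt | openssl md5
openssl rsa -noout -modulus -in /etc/ambari-metrics-grafana/conf/ams-grafana.key | openssl md5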
Update the Grafana configuration from Ambari:
AmbariUI / Services / Ambari Metrics / Configs -> Advanced ams-grafana-ini
protocol = https
ca_cert = /etc/ambari-metrics-grafana/conf/ams-ca.crt
cert_file = /etc/ambari-metrics-grafana/conf/ams-grafana.crt
cert_key = /etc/ambari-metrics-grafana/conf/ams-grafana.key
Save the changes, and restart all affected services.
Double-check the Grafana log file:
tail -f /var/log/ambari-metrics-grafana/grafana.log
The following line confirms that Grafana is listening on port 3000 over HTTPS:
2018/12/12 03:42:41 [I] Listen: https://0.0.0.0:3000
Double-check the certificate in place using:
openssl s_client -connect c3132-node2.user.local:3000 </dev/null
Open Grafana from Ambari to validate that it is working as expected:
AmbariUI / Services / Ambari Metrics / Summary -> Quick Link Grafana
Note: Ignore the warning and proceed.
With all these steps, Grafana is configured to use CA-signed certificates and the communication is over HTTPS.
12-10-2019
03:42 AM
This video explains how to configure Ambari Metrics System (AMS) high availability.
Open YouTube video here
To enable AMS high availability, the collector has to be configured to run in a distributed mode. When the Collector is configured for distributed mode, it writes metrics to HDFS, and the components run in distributed processes, which helps to manage CPU and memory.
The following steps assume a cluster configured for a highly available NameNode.
Set the HBase Root directory value to use the HDFS name service instead of the NameNode hostname.
Migrate existing data from the local store to HDFS prior to switching to a distributed mode.
To switch the Metrics Collector from embedded mode to distributed mode, update the Metrics Service operation mode and the location where the metrics are being stored. In summary, the following steps are required:
Stop Ambari Metrics System
Prepare the Environment to migrate from Local File System to HDFS
Migrate Collector Data to HDFS
Configure Distributed Mode using Ambari
Restart all affected services and monitor the Collector log
Stop all the services associated with the AMS component using Ambari
AmbariUI / Services / Ambari Metrics / Summary / Action / Stop
Prepare the Environment to migrate from Local File System to HDFS
AMS_User=ams
AMS_Group=hadoop
AMS_Embedded_RootDir=$(grep -C 2 hbase.rootdir /etc/ambari-metrics-collector/conf/hbase-site.xml | awk -F"[<|>|:]" '/value/ {print $4}' | sed 's|//||1')
ActiveNN=$(su -l hdfs -c "hdfs haadmin -getAllServiceState | awk -F '[:| ]' '/active/ {print \$1}'")
NN_Port=$(su -l hdfs -c "hdfs haadmin -getAllServiceState | awk -F '[:| ]' '/active/ {print \$2}'")
HDFS_Name_Service=$(grep -A 1 dfs.nameservice /etc/hadoop/conf/hdfs-site.xml | awk -F"[<|>]" '/value/ {print $3}')
HDFS_AMS_PATH=/apps/ams/metrics
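Before copying anything, it can be worth echoing the variables to confirm they resolved to sensible values. This quick sanity check is not part of the original steps:
echo "AMS root dir:      ${AMS_Embedded_RootDir}"
echo "Active NameNode:   ${ActiveNN}:${NN_Port}"
echo "HDFS name service: ${HDFS_Name_Service}"
echo "HDFS target path:  ${HDFS_AMS_PATH}"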
Create the folder for Collector data in HDFS
su -l hdfs -c "hdfs dfs -mkdir -p ${HDFS_AMS_PATH}"
su -l hdfs -c "hdfs dfs -chown ${AMS_User}:${AMS_Group} ${HDFS_AMS_PATH}"
Update permissions to be able to copy collector data from local file system to HDFS
namei -l ${AMS_Embedded_RootDir}/staging
chmod +rx ${AMS_Embedded_RootDir}/staging
Copy collector data from local file system to HDFS
su -l hdfs -c "hdfs dfs -copyFromLocal ${AMS_Embedded_RootDir} hdfs://${ActiveNN}:${NN_Port}${HDFS_AMS_PATH}"
su - hdfs -c "hdfs dfs -chown -R ${AMS_User}:${AMS_Group} ${HDFS_AMS_PATH}"
Configure the Collector to distributed mode using Ambari:
AmbariUI / Services / Ambari Metrics / Configs / Metrics Service operation mode = distributed
AmbariUI / Services / Ambari Metrics / Configs / Advanced ams-hbase-site / hbase.cluster.distributed = true
AmbariUI / Services / Ambari Metrics / Configs / Advanced ams-hbase-site / HBase root directory = hdfs://AMSHA/apps/ams/metrics
AmbariUI / Services / HDFS / Configs / Custom core-site
hadoop.proxyuser.hdfs.groups = *
hadoop.proxyuser.root.groups = *
hadoop.proxyuser.hdfs.hosts = *
hadoop.proxyuser.root.hosts = *
AmbariUI / Services / HDFS / Configs / Advanced hdfs-site / HDFS Short-circuit read = true (check)
AmbariUI -> Restart All required
Note: Impersonation is the ability to allow a service user to securely access data in Hadoop on behalf of another user. When proxy users are configured, any access through a proxy is executed with the impersonated user's existing privilege levels rather than those of a superuser such as hdfs. The behavior is similar for proxy hosts: they limit the hosts from which impersonated connections are allowed. For this article and testing purposes, all users and all hosts are allowed.
Additionally, one of the key principles behind Apache Hadoop is the idea that moving computation is cheaper than moving data. With short-circuit local reads, since the client and the data are on the same node, there is no need for the DataNode to be in the data path; the client itself can simply read the data from the local disk, improving performance.
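For reference, the short-circuit read checkbox in Ambari maps to hdfs-site.xml properties like the following (the domain socket path shown is the usual HDP default and may differ on your cluster):
dfs.client.read.shortcircuit = true
dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket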
Once AMS is up and running, the following message is displayed in the Metrics Collector log:
2018-12-12 01:21:12,132 INFO org.eclipse.jetty.server.Server: Started @14700ms
2018-12-12 01:21:12,132 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app timeline started at 6188
2018-12-12 01:21:40,633 INFO org.apache.ambari.metrics.core.timeline.availability.MetricCollectorHAController:
######################### Cluster HA state ########################
CLUSTER: ambari-metrics-cluster
RESOURCE: METRIC_AGGREGATORS
PARTITION: METRIC_AGGREGATORS_0 c3132-node2.user.local_12001 ONLINE
PARTITION: METRIC_AGGREGATORS_1 c3132-node2.user.local_12001 ONLINE
##################################################
According to the above message, there is a cluster with only one Collector. The next logical step is to add an additional Collector from Ambari. To do this, do the following:
AmbariUI / Hosts / c3132-node3.user.local / Summary -> +ADD -> Metrics Collector
Note: c3132-node3.user.local is the node where you will be adding the Collector. Since distributed mode is already enabled, after adding the collector, start the service. Once the AMS is up and running, the following message is displayed in the Metrics Collector Log:
2018-12-12 01:34:56,060 INFO org.apache.ambari.metrics.core.timeline.availability.MetricCollectorHAController:
######################### Cluster HA state ########################
CLUSTER: ambari-metrics-cluster
RESOURCE: METRIC_AGGREGATORS
PARTITION: METRIC_AGGREGATORS_0 c3132-node2.user.local_12001 ONLINE
PARTITION: METRIC_AGGREGATORS_1 c3132-node3.user.local_12001 ONLINE
##################################################
According to the above message, the cluster has two collectors.