Member since 02-07-2019 · 1792 Posts · 1 Kudos Received · 0 Solutions
04-02-2020 03:38 AM · 1 Kudo
This is a short video tutorial to configure cross-realm trust between two secure (Kerberized) clusters with different realm names. Cluster 1 (c1232) has the realm name SUPPORTLAB.CLOUDERA.COM and Cluster 2 (c4232) has the realm name COELAB.CLOUDERA.COM. This video explains the steps to set up a cross-realm trust in order to perform a distcp operation.
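At a high level, the trust described in the video comes down to creating matching cross-realm krbtgt principals in both KDCs and then running distcp across the clusters. A hedged sketch follows, using the realm names from this article; the node hostnames and password placeholder are illustrative, not from the video:

```shell
# On BOTH KDCs, create the cross-realm krbtgt principals with identical
# passwords and enctypes (password placeholder below is illustrative):
#   kadmin.local -q "addprinc -pw <SamePassword> krbtgt/COELAB.CLOUDERA.COM@SUPPORTLAB.CLOUDERA.COM"
#   kadmin.local -q "addprinc -pw <SamePassword> krbtgt/SUPPORTLAB.CLOUDERA.COM@COELAB.CLOUDERA.COM"
#
# Make both realms resolvable in /etc/krb5.conf on the client hosts
# ([realms] and [domain_realm]; add [capaths] if the trust path is indirect).
# Then, with a valid ticket, copy between the clusters (hostnames hypothetical):
hadoop distcp hdfs://c1232-node1:8020/tmp/source hdfs://c4232-node1:8020/tmp/destination
```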
Open the video on YouTube here
12-10-2019 08:22 AM
This video describes how to use CA-signed certificates for the Ambari Metrics System deployed in distributed mode with multiple Metrics Collectors.
Open YouTube video here
Ambari Metric System (AMS) HA
Ambari Metrics System is an Ambari-native, pluggable, and scalable system for collecting and querying Hadoop metrics. It includes Grafana, a powerful, fully open-source dashboard builder with wide community adoption. The Metrics Collector is the REST API component that receives the metrics payload as JSON over HTTP from the Sinks and Monitors. The metrics are written into the HBase storage layer, which is dedicated storage for metric data managed as part of AMS, separate from the cluster HBase. The HBase schema is defined using Phoenix, and all read/write operations from AMS are Phoenix JDBC API calls. The Sink implementations are native to AMS and are placed by Ambari in the classpath of the supported Hadoop ecosystem services. The Monitors are lightweight Python daemons for system counters that use psutil native libraries for data collection. AMS can scale horizontally by adding Collector nodes, which effectively adds HBase RegionServers to handle the increased read/write load. The Ambari stack advisor is used to advise on AMS configurations proportional to the number of Sinks and Monitors, and thereby the cluster size.
For this article, the CA has provided two PKCS#12 certificate bundles, amc01.p12 and amc02.p12. Since the same CA issued both certificates, you can extract the CA certificates (root + intermediates) from either one. This configuration assumes the following locations:
/var/tmp/certificates/AMS. The path where the PKCS#12 bundles will be copied.
/var/tmp/certificates/AMS/TRUSTSTORE. The path where the truststore for all nodes will be created.
/var/tmp/certificates/AMS/KEYSTORE/{AMC01,AMC02}. The paths where the keystores for the collectors will be created.
/usr/jdk64/jdk1.8.0_112. The path of the installed Java version.
c3132-node1, c3132-node2, c3132-node3, c3132-node4. The HDP cluster nodes.
c3132-node1. The Ambari server.
c3132-node2, c3132-node3. The cluster nodes configured as Ambari Metrics Collectors.
/labs/AMS/truststore.jks. The path of the truststore on all nodes.
/labs/AMS/keystore.jks. The path of the keystore on each Ambari Metrics Collector.
SSL Setup Logical Steps
Basically, for each Metrics Collector, add the PKCS#12 bundle to a keystore under an alias matching the Metrics Collector FQDN as a PrivateKeyEntry, and add the root CA and intermediate certificates to a truststore as trustedCertEntry entries.
Every time Ambari starts the service, it tries to export the root CA and intermediate certificates from the truststore located on all nodes. First, it converts the truststore from JKS format to PKCS12 format, then it exports all the CA certificates from the truststore into its configuration directory, creating a file called ca.pem. You can see the following messages on the Ambari Operations Status page.
Execute['ambari-sudo.sh /usr/jdk64/jdk1.8.0_112/bin/keytool -importkeystore -srckeystore
/labs/AMS/truststore.jks -destkeystore /tmp/tmp0_1xE1/truststore.p12
-srcalias c3132-node3.user.local -deststoretype PKCS12 -srcstorepass hadoop1234
-deststorepass hadoop1234'] {}
Execute['ambari-sudo.sh /usr/jdk64/jdk1.8.0_112/bin/keytool -importkeystore -srckeystore
/labs/AMS/truststore.jks -destkeystore /tmp/tmp0_1xE1/truststore.p12 -srcalias
c3132-node2.user.local -deststoretype PKCS12 -srcstorepass hadoop1234 -deststorepass hadoop1234'] {}
Execute['ambari-sudo.sh openssl pkcs12 -in /tmp/tmpI3YmtL/truststore.p12 -out
/etc/ambari-metrics-monitor/conf/ca.pem -cacerts -nokeys -passin pass:hadoop1234'] {}
Follow these steps to complete the previous setup. For this procedure, the node c3132-node2.user.local holds the active Ambari Metrics Collector.
Since you received two certificate bundles from the same Certificate Authority, extract the CA certificates from one of the PKCS#12 bundles:
cd /var/tmp/certificates/AMS && ls -l
openssl pkcs12 -in c3132-node2.user.local.p12 -out rootca.crt -cacerts -nokeys -passin pass:hadoop1234
Create the truststore and add the CA Certificate.
/usr/jdk64/jdk1.8.0_112/bin/keytool -keystore TRUSTSTORE/truststore.jks -alias caroot
-import -file rootca.crt -storepass hadoop1234
/usr/jdk64/jdk1.8.0_112/bin/keytool -list -keystore TRUSTSTORE/truststore.jks
Add the PrivateKeyEntry for all the Ambari Metrics Collectors to the truststore, using the FQDN as the alias:
/usr/jdk64/jdk1.8.0_112/bin/keytool -importkeystore -srckeystore c3132-node2.user.local.p12
-alias c3132-node2.user.local -destkeystore TRUSTSTORE/truststore.jks -srcstoretype pkcs12
-deststoretype jks
/usr/jdk64/jdk1.8.0_112/bin/keytool -importkeystore -srckeystore c3132-node3.user.local.p12
-alias c3132-node3.user.local -destkeystore TRUSTSTORE/truststore.jks -srcstoretype pkcs12
-deststoretype jks
/usr/jdk64/jdk1.8.0_112/bin/keytool -list -keystore TRUSTSTORE/truststore.jks
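After the imports, you can sanity-check that the truststore holds the expected entry types in a scriptable way. This is a sketch: the `count_entries` helper is hypothetical, and it assumes the short `keytool -list` output format of `alias, <date>, <entryType>,` per line:

```shell
# Count entries of a given type in `keytool -list` output read from stdin.
count_entries() {
  local entry_type="$1"
  grep -c ", ${entry_type}" -
}

# Hypothetical usage against the truststore built above
# (with two collectors, expect 2 PrivateKeyEntry rows):
# /usr/jdk64/jdk1.8.0_112/bin/keytool -list -keystore TRUSTSTORE/truststore.jks \
#   -storepass hadoop1234 | count_entries PrivateKeyEntry
```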
Create the keystore for the first Ambari Metrics Collector, adding the root CA as a trustedCertEntry and the server certificate as a PrivateKeyEntry:
/usr/jdk64/jdk1.8.0_112/bin/keytool -keystore KEYSTORE/AMC01/keystore.jks -alias caroot
-import -file rootca.crt -storepass hadoop1234
/usr/jdk64/jdk1.8.0_112/bin/keytool -importkeystore -srckeystore
c3132-node2.user.local.p12 -alias c3132-node2.user.local
-destkeystore KEYSTORE/AMC01/keystore.jks -srcstoretype pkcs12 -deststoretype jks
Create the keystore for the second Ambari Metrics Collector, adding the root CA as a trustedCertEntry and the server certificate as a PrivateKeyEntry:
/usr/jdk64/jdk1.8.0_112/bin/keytool -keystore KEYSTORE/AMC02/keystore.jks -alias caroot
-import -file rootca.crt -storepass hadoop1234
/usr/jdk64/jdk1.8.0_112/bin/keytool -importkeystore -srckeystore
c3132-node3.user.local.p12 -alias c3132-node3.user.local -destkeystore
KEYSTORE/AMC02/keystore.jks -srcstoretype pkcs12 -deststoretype jks
Copy the truststore to all nodes, including the Ambari server, and the keystore to each Ambari Metrics Collector:
for i in c3132-node1 c3132-node2 c3132-node3 c3132-node4
do
ssh root@${i} "mkdir -p /labs/AMS"
scp /var/tmp/certificates/AMS/TRUSTSTORE/truststore.jks root@${i}:/labs/AMS/
if [[ ${i} == "c3132-node2" ]] ; then
scp /var/tmp/certificates/AMS/KEYSTORE/AMC01/keystore.jks root@${i}:/labs/AMS/
elif [[ ${i} == "c3132-node3" ]] ; then
scp /var/tmp/certificates/AMS/KEYSTORE/AMC02/keystore.jks root@${i}:/labs/AMS/
else
echo
fi
done
From Ambari, configure the SSL properties (SSL Server/Client) to reference the Keystore and Truststore.
AmbariUI / Services / Ambari Metrics / Configs /
ams-site
timeline.metrics.service.http.policy=HTTPS_ONLY
ams-ssl-server
ssl.server.keystore.keypassword=hadoop1234
ssl.server.keystore.location=/labs/AMS/keystore.jks
ssl.server.keystore.password=hadoop1234
ssl.server.keystore.type=jks
ssl.server.truststore.location=/labs/AMS/truststore.jks
ssl.server.truststore.password=hadoop1234
ssl.server.truststore.reload.interval=10000
ssl.server.truststore.type=jks
ams-ssl-client
ssl.client.truststore.location=/labs/AMS/truststore.jks
ssl.client.truststore.password=hadoop1234
ssl.client.truststore.type=jks
AmbariUI -> Restart All Required
Configure the Ambari server to use HTTPS instead of HTTP in all requests to the AMS Collector:
ssh root@c3132-node1
echo "server.timeline.metrics.https.enabled=true" >> /etc/ambari-server/conf/ambari.properties
ambari-server setup-security
Using python /usr/bin/python
Security setup options...
===========================================================================
Choose one of the following options:
[1] Enable HTTPS for Ambari server.
[2] Encrypt passwords stored in ambari.properties file.
[3] Setup Ambari kerberos JAAS configuration.
[4] Setup truststore.
[5] Import certificate to truststore.
===========================================================================
Enter choice, (1-5): 4
Do you want to configure a truststore [y/n] (y)? y
TrustStore type [jks/jceks/pkcs12] (jks):
Path to TrustStore file :/labs/AMS/truststore.jks
Password for TrustStore:
Re-enter password:
Ambari Server 'setup-security' completed successfully.
ambari-server restart
From one of the Ambari Metrics Monitor hosts, validate the HTTPS communication:
ssh root@c3132-node4
tail -f /var/log/ambari-metrics-monitor/ambari-metrics-monitor.log
The following messages reflect HTTPS communication to the active Metrics Collector:
2018-12-12 02:27:11,835 [INFO] emitter.py:210 - Calculated collector shard based on hostname : c3132-node2.user.local
2018-12-12 02:27:11,835 [INFO] security.py:52 - SSL Connect being called..
connecting to https://c3132-node2.user.local:6188/
2018-12-12 02:27:11,855 [INFO] security.py:43 - SSL connection established.
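Besides the monitor log, you can probe the active collector's HTTPS endpoint directly. A hedged example: it uses the ca.pem bundle that Ambari exports from the truststore (shown earlier in this article) and the standard AMS metadata path; adjust the hostname to your active collector:

```shell
# Spot-check the collector over HTTPS using the CA bundle exported by Ambari:
curl --cacert /etc/ambari-metrics-monitor/conf/ca.pem \
  "https://c3132-node2.user.local:6188/ws/v1/timeline/metrics/metadata"
```

A successful TLS handshake here confirms the truststore/keystore pair is consistent end to end.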
12-10-2019 08:21 AM
This video explains how to configure Spark2 to use HiveWarehouseConnector.
Open the video on YouTube here
To access Hive from Spark2 on HDP3, there are some requirements to meet in order to use the HiveWarehouseConnector. The configuration steps for the HiveWarehouseConnector can be set at the cluster level and/or the job level. This requires collecting base information from the Hive service, to be configured later on Spark2 via Ambari, or provided per application submission as arguments to the Spark2 client from a terminal.
12-10-2019 08:19 AM
On HDP3, the SparkSQL API directly queries Spark2's own catalog namespace. The Spark catalog is independent of the Hive catalog; hence, the HiveWarehouseConnector was developed to allow Spark users to query Hive data through the HiveWarehouseSession API. Hive tables on HDP3 are ACID by default, but Spark2 does not yet operate on ACID tables. To guarantee data integrity, the HiveWarehouseConnector processes queries through the HiveServer2 Interactive (LLAP) service. This is not the case for external tables.
This video will explain how to access Hive from Spark2 on HDP3 along with some architectural changes and the support provided for particular use cases.
Open the video on YouTube here
12-10-2019 08:18 AM
This video describes an easy-to-use Python script to generate data for Hive, based on an input table schema. This data generator for Hive solves the issue of loading data into tables with many columns (such as more than 1500 columns). This automation script supports faster testing of queries and performance analysis. To get the code, see the KB link (for customers only).
Open the video on YouTube here
12-10-2019 08:16 AM
This video describes how Kafka ACLs work in HDP. This method is not supported in CDP 7; please investigate Ranger authorization for ACLs in CDP.
Open the video on YouTube here
Apache Kafka comes with an authorizer implementation that uses ZooKeeper to store all the ACLs. The ACLs have to be set because access to resources is limited to super users when an authorizer is configured. By default, if a resource has no associated ACLs, then no one is allowed to access the resource except super users. The following are the main ACL commands:
Add ACLs:
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<zkHost>:<zkPort> --add
--allow-principal User:<username> --operation All --topic <topicName> --group=*
In the above command, ACLs are added to allow a principal to have All operations available over the topic specified. The following are the available operations:
Read
Write
Create
Delete
Alter
Describe
ClusterAction
DescribeConfigs
AlterConfigs
IdempotentWrite
All
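Rather than granting All, an ACL can be scoped to just what a client needs. A sketch using the same CLI (the topic and user names here are illustrative; `--operation` may be repeated to grant several operations at once):

```shell
# Allow user 'alice' to read and describe topic 'events' with any consumer group:
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<zkHost>:<zkPort> --add \
  --allow-principal User:alice --operation Read --operation Describe \
  --topic events --group=*
```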
Using --group=* means that this user is allowed to use any consumer group when running a Kafka consumer. The following is the command to list ACLs:
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<zkHost>:<zkPort> --list
In the above command, the available ACLs for the Kafka cluster are listed using --list. More details about ACL options are available in the following references:
Authorization and ACLs
ACLs command line interface
12-10-2019 08:14 AM
Many times, it is necessary for an engineer or administrator to manipulate the content of Ambari Infra Solr using the command-line utilities, since they might or might not have access to the GUI.
This video helps to understand the basic manipulation of:
Listing collections and checking the cluster status of Solr Cloud.
Creating new collections.
Deleting existing collections.
To check if the ambari-infra-solr server instance is running on the node, run the following:
# ps -elf | grep -i infra-solr
# netstat -plant | grep -i 8886
If the cluster is Kerberized, check for valid Kerberos tickets:
# klist -A
Obtain a Kerberos ticket, if not present:
# kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab
$(klist -kt /etc/security/keytabs/ambari-infra-solr.service.keytab
|sed -n "4p"|cut -d ' ' -f7)
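The `sed`/`cut` pipeline above hard-codes field 7 of the `klist -kt` output, which breaks if the timestamp format changes width. A slightly more robust variant (a sketch: the `principal_from_keytab_listing` helper is hypothetical, and it assumes the standard layout of three header lines followed by `KVNO Timestamp Principal` rows) takes the last field instead:

```shell
# Extract the principal from `klist -kt` output read on stdin:
# skip the 3 header lines, take the last field of the first entry row.
principal_from_keytab_listing() {
  sed -n '4p' | awk '{print $NF}'
}

# Hypothetical usage:
# kinit -kt /etc/security/keytabs/ambari-infra-solr.service.keytab \
#   "$(klist -kt /etc/security/keytabs/ambari-infra-solr.service.keytab | principal_from_keytab_listing)"
```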
List Solr collections:
curl --negotiate -u : "http://$(hostname -f):8886/solr/admin/collections?action=list"
Create a collection:
# curl --negotiate -u : "http://$(hostname -f):8886/solr/admin/collections?action=CREATE&name=<collection_name>&numShards=<number>"
The following are the optional Values:
&maxShardsPerNode=<number>
&replicationFactor=<number>
Delete a collection:
# curl --negotiate -u : "http://$(hostname -f):8886/solr/admin/collections?action=DELETE&name=<collection_name>"
Check status of the Solr Cloud cluster:
# curl --negotiate -u : "http://$(hostname -f):8886/solr/admin/collections?action=clusterstatus&wt=json" | python -m json.tool
Index keys:
solr_host = the host where the Solr instance(s) is running
collection = the name of the collection
shard = the name of the shard
action = CREATE (to add a collection)
action = DELETE (to delete a collection)
action = CLUSTERSTATUS (to get the list of available collections in the Solr Cloud cluster)
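The curl calls above all share the same Collections API URL shape. A small helper (hypothetical, for illustration only; the host, port, and action/name parameters follow the commands in this article) makes that explicit and keeps the quoting in one place:

```shell
# Build a Solr Collections API URL for a given host and action.
# Extra query parameters (e.g. '&name=test&numShards=2') go in the 3rd argument.
solr_collections_url() {
  local host="$1" action="$2" extra="${3:-}"
  echo "http://${host}:8886/solr/admin/collections?action=${action}${extra}"
}

# Hypothetical usage:
# curl --negotiate -u : "$(solr_collections_url "$(hostname -f)" CREATE '&name=test&numShards=2')"
```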
12-10-2019 08:12 AM
This video describes how to upgrade Ambari 2.6.2.2 to Ambari 2.7.3.
Open the video on YouTube here
Apache Ambari 2.7.3 is the latest among the Ambari 2.7.x releases. Ambari 2.7.0, the first release in the 2.7.x series, introduced significant improvements over its predecessor, Ambari 2.6.2. This video will help users upgrade from Ambari 2.6.2.2 to Ambari 2.7.3.
Procedure
I. Prerequisites
Take a backup of the Ambari configuration file:
# mkdir /root/backups
# cp /etc/ambari-server/conf/ambari.properties /root/backups
Turn off Service Auto Restart:
From Ambari UI: Admin > Service Auto Start. Set Auto Start Services to Disabled. Click Save.
Run Service Checks on all Ambari services.
On each of the Ambari services installed on the cluster, run a Service Check as follows:
From Ambari UI: <Service_Name> > Service Actions > Run Service Check
For example: HDFS > Service Actions > Run Service Check.
Start and Stop all of the Ambari services from Ambari UI.
II. Stop Services
If SmartSense is deployed, stop it and turn on Maintenance Mode. From Ambari Web, browse to Services > SmartSense and select Stop from the Service Actions menu. Then, select Turn on Maintenance Mode from the Service Actions menu.
If Ambari Metrics is deployed, stop it and turn on Maintenance Mode. From Ambari Web, browse to Services > Ambari Metrics and select Stop from the Service Actions menu. Then, select Turn on Maintenance Mode from the Service Actions menu.
If Log Search is running in the cluster, stop the service. From Ambari Web, browse to Services > Log Search and select Stop from the Service Actions menu. Then, select Turn on Maintenance Mode from the Service Actions menu.
Stop Ambari server:
# ambari-server stop
Stop Ambari agents:
# ambari-agent stop
Back up the Ambari database:
# mysqldump -u ambari -p ambari > /root/backups/ambari-before-upgrade.sql
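If the upgrade has to be rolled back, the same dump can be restored; a sketch (whether you drop and recreate the schema first depends on your environment):

```shell
# Restore the pre-upgrade Ambari schema from the backup taken above:
mysql -u ambari -p ambari < /root/backups/ambari-before-upgrade.sql
```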
III. Download Ambari 2.7.3 repository
1. Replace the old Ambari repository with the latest one on all hosts in the cluster:
# wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
2. Upgrade Ambari server
# yum clean all
# yum upgrade ambari-server
Note: If HDF components are deployed in the HDP setup, upgrade the HDF Management Pack before upgrading the database schema in step IV. For more details, see Upgrade the HDF Management Pack.
3. Upgrade Ambari agents:
# yum clean all
# yum upgrade ambari-agent
IV. Upgrade Database Schema
On the Ambari server host, upgrade the Ambari database schema:
# ambari-server upgrade
Start Ambari server:
# ambari-server start
Start Ambari agents:
# ambari-agent start
V. Verify Ambari version
From the Ambari UI: Go to Admin > About:
12-10-2019 08:10 AM
This video describes a step-by-step process for getting an HDP 3 cluster up and running on CentOS 7. The video follows the Hortonworks documentation and support matrix recommendations. Public repositories were used for a minimal two-node install on CentOS 7.5.
Services installed on the Ambari node: ambari-server.
Services installed on node1: SmartSense, Ambari Metrics.
Open the video on YouTube here
Get ready
Clean the yum cache: yum clean all
Rebuild the cache: yum makecache
Install utilities: yum install openssl openssh-clients curl unzip gzip tar wget
Double-check the free RAM in the system: free -m
Check the limits configuration: ulimit -n -u
Set limits temporarily: ulimit -n 32768 ; ulimit -u 65536
Set limits permanently: vim /etc/security/limits.conf (root - nofile 32768, root - nproc 65536)
Generate an RSA SSH key: ssh-keygen
Send the public RSA key to node1 and configure it in the authorized keys file: ssh-copy-id 10.200.82.41
Test the passwordless connection: ssh 10.200.82.41
Install the NTP package: yum install ntp -y
Edit the NTP conf file to set the ISO code, as shown in the first column of the Stratum One Time Servers list (http://support.ntp.org/bin/view/Servers/StratumOneTimeServers): vim /etc/ntp.conf
Start the NTP service: systemctl start ntpd
Check if the service is running: systemctl status ntpd
Print the list of time servers the hosts are synchronizing with: ntpq -p
Check the time drift between the hosts and an NTP server: ntpdate -q 0.centos.pool.ntp.org
Set the hostnames on the fly: hostname ambari.local / hostname node1.local
Edit the /etc/hosts file to set the IP-to-name mapping: vim /etc/hosts (10.200.82.40 ambari.local ambari, 10.200.82.41 node1.local node1)
Edit the OS network file on each node to set the permanent hostname: vim /etc/sysconfig/network (NETWORKING=yes, HOSTNAME=ambari.local or HOSTNAME=node1.local)
Run a new shell so the current hostnames show in the prompt: bash
Double-check the output of hostname with and without -f; they should be the same: hostname ; hostname -f
Disable firewalld while installing: systemctl disable firewalld
Stop the firewall service: service firewalld stop
Check if SELinux is currently in enforcing mode: getenforce
Set it to permissive (or disabled): setenforce 0
Check if it switched modes: getenforce
Download the public Apache Ambari repo file: wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.1.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
List the currently configured repositories: yum repolist
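The checks above can be wrapped into a small pre-flight script to run on every node before installing. This is a sketch under this article's conventions (the thresholds come from the limits set earlier; the script name and WARN wording are my own):

```shell
#!/bin/bash
# Pre-flight checks before installing Ambari/HDP on a node.
[ "$(hostname)" = "$(hostname -f)" ] \
  || echo "WARN: hostname and hostname -f differ"
[ "$(getenforce 2>/dev/null)" = "Enforcing" ] \
  && echo "WARN: SELinux is still enforcing"
[ "$(ulimit -n)" -ge 32768 ] 2>/dev/null \
  || echo "WARN: open-files limit is below 32768"
```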
Install Ambari Server
Install the ambari-server package: yum install ambari-server
Configure ambari-server: ambari-server setup
Start the service: ambari-server start
Deploy HDP cluster components
Browse to the Ambari Server user interface (UI); the default username and password are both admin: http://ambari.local:8080/
Take a look at the root user's private RSA key file, the one generated before: cat .ssh/id_rsa
12-10-2019 08:03 AM
From Ambari 2.6 onward, for all MYSQL_SERVER components in a blueprint, the mysql-connector-java.jar needs to be manually installed and registered. This video describes how to install and register the MySQL connector to replace the embedded database instance that Ambari Server uses by default.
Open YouTube video here
For certain services, Cloudbreak allows registering an existing RDBMS instance as an external source for a database. After registering the RDBMS with Cloudbreak, it can be used for multiple clusters. However, as this configuration needs to be used by Ambari before its installation, the MySQL Connector needs to be available to connect to the remote MySQL database.
To manually install and register MySQL connector, do the following:
Preparing MySQL Database Server
Install MySQL Server on CentOS Linux 7:
# yum -y localinstall https://dev.mysql.com/get/mysql57-community-release-el7-8.noarch.rpm
# yum -y install mysql-community-server
# systemctl start mysqld.service
Complete the MySQL initial setup. Depending on the MySQL version, use a blank password for the MySQL root user or get the password from mysqld.log:
# grep password /var/log/mysqld.log
# mysql_secure_installation
Create a user for Ambari, grant permissions, and create the initial database:
# mysql -u root -p
CREATE USER 'ambari'@'%' IDENTIFIED BY 'Hadoop1234!';
GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'%';
CREATE USER 'ambari'@'localhost' IDENTIFIED BY 'Hadoop1234!';
GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'localhost';
FLUSH PRIVILEGES;
CREATE DATABASE ambari01;
Configure Cloudbreak to use MySQL External Database
Create a pre-ambari-start recipe to install the mysql-connector-java.jar:
#!/bin/bash
# Provide the JDBC Connector JAR file.
# During cluster creation, Cloudbreak uses the /opt/jdbc-drivers directory for the JAR file.
yum -y localinstall https://dev.mysql.com/get/mysql57-community-release-el7-8.noarch.rpm
yum -y install mysql-connector-java*
if [[ ! -d /opt/jdbc-drivers ]]
then
  mkdir /opt/jdbc-drivers
  cp /usr/share/java/mysql-connector-java.jar /opt/jdbc-drivers/mysql-connector-java.jar
fi
Register the database configuration:
Database: MySQL
MySQL Server: MySQL_DB_IP/FQDN
MySQL User: ambari
MySQL Password: Hadoop1234!
JDBC Connector JAR URL: Empty
JDBC Connection: jdbc:mysql://MySQL_DB_IP/FQDN:Port/ambari01
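Once the connector JAR is in place on the Ambari host, it still has to be registered with Ambari itself; a hedged sketch using the standard `ambari-server setup` JDBC options and the JAR path from the recipe above:

```shell
# Register the MySQL JDBC driver with Ambari Server:
ambari-server setup --jdbc-db=mysql \
  --jdbc-driver=/opt/jdbc-drivers/mysql-connector-java.jar
```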