09-19-2019
12:20 PM
1 Kudo
This article from Stack Overflow may be helpful: https://stackoverflow.com/questions/17355270/recovering-black-listed-tasktrackers-in-mapreduce-job. In general, a blacklisted node can be brought back online by cleaning up the errors. You will need to look at the TaskTracker logs to see what was causing the failures. Let me know what happens. Ron
09-27-2017
03:48 AM
1 Kudo
The preferred configuration for integrating with Active Directory is to use a standalone KDC and create a cross-realm trust. I have done several of these deployments on physical hardware. Recently I built a test system on our OpenStack lab cluster using a small instance for the KDC. I followed the instructions in the HDP Security guide for configuring a KDC. When I created the database I noticed that the kdb5_util create -s command was stalling out. I tried several fixes and it took way too long.
I did some searching on Kerberos and learned how the Kerberos utilities create the random data needed for encryption. The designers of Kerberos wanted a truly random data generator, so they based it on OS activity. The kernel exposes the available entropy at /proc/sys/kernel/random/entropy_avail; you can cat this value to see how much entropy your system has available. Since a VM is mostly idle, you will get a small value.
RedHat provides a package called rng-tools that you can install with yum:
sudo yum install rng-tools
Then enable and start rngd:
sudo chkconfig rngd on
sudo service rngd start
Cat /proc/sys/kernel/random/entropy_avail again to confirm that the entropy in your VM has increased. With a much higher value, kdb5_util create -s completes in a few seconds. Reference documentation from RedHat: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security_Guide/sect-Security_Guide-Encryption-Using_the_Random_Number_Generator.html
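The entropy check can be wrapped in a small script to run before and after starting rngd. This is only a sketch: it assumes a Linux /proc interface, and the 1000-bit threshold is an illustrative value I chose, not an official recommendation.

```shell
#!/bin/sh
# classify_entropy: decide whether an entropy_avail reading looks worryingly low.
# The 1000-bit threshold below is an illustrative assumption.
classify_entropy() {
    if [ "$1" -lt 1000 ]; then
        echo "low"
    else
        echo "ok"
    fi
}

# Read the live kernel counter when available. Idle VMs without rngd
# often report only a few hundred bits here.
ENTROPY_FILE=/proc/sys/kernel/random/entropy_avail
if [ -r "$ENTROPY_FILE" ]; then
    avail=$(cat "$ENTROPY_FILE")
    echo "entropy_avail=$avail ($(classify_entropy "$avail"))"
fi
```

Running it once before installing rng-tools and once after starting rngd makes it easy to confirm the entropy pool actually grew.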
03-25-2017
06:03 AM
3 Kudos
Overview
This document provides instructions and background information on how to change the service accounts in an HDP cluster that is managed by Ambari.
Service accounts in Hortonworks Data Platform
The Hortonworks Data Platform uses service accounts to provide a common user to perform cluster operations and to hold ownership of files and folders for a given service.
All major services are assigned a service account. The Ambari web interface provides a list of the service accounts used by a specific cluster. The table below lists common service accounts.
Hadoop Service         User        Group
HDFS                   hdfs        hadoop
YARN                   yarn        hadoop
MapReduce              mapred      hadoop, mapred
Hive                   hive        hadoop
HCatalog/WebHCatalog   hcat        hadoop
HBase                  hbase       hadoop
Falcon                 falcon      hadoop
Sqoop                  sqoop       hadoop
ZooKeeper              zookeeper   hadoop
Oozie                  oozie       hadoop
Knox Gateway           knox        hadoop
In the course of a standard Ambari installation the service accounts are created on the local Linux systems and assigned the appropriate privileges and ownerships. Many enterprises manage service accounts through directory services such as Microsoft Active Directory. To use such accounts, they must be created before the cluster installation and the admin must override the default service accounts.
For US Bank the clusters have already been deployed, so we need to retroactively change the service accounts. This document describes the process for doing so.
Ambari management of service accounts and other cluster metadata
For the Hortonworks Data Platform, Ambari manages all cluster services and cluster metadata. Cluster metadata consists of parameter settings for services, access strings, and other settings that determine how the cluster services work.
Ambari uses a database to store all cluster metadata. The default database is Postgres, but any SQL database such as MySQL or Oracle can be used.
Ambari policy defines the cluster metadata in the Ambari database as the master values. Ambari will overwrite cluster component files with the values in its database, typically during service restarts. To change the service account names we therefore have to modify the Ambari database.
Ambari provides a REST API to allow us to make these changes.
Details on the Ambari REST API can be found at the Apache Ambari project web page at https://ambari.apache.org/.
Service accounts interaction with Isilon OneFS
For Hortonworks Data Platform 2.2.8, ownership and privileges between Hortonworks Data Platform and EMC Isilon OneFS have to be managed manually by the admins. Detailing how to do this is outside the scope of this document; detailed information can be found in the EMC Isilon OneFS docs.
Managing settings in Ambari
Ambari uses a database that is created during installation to record and manage all cluster settings. Ambari uses this database to update and refresh cluster settings when services are restarted. As such, it is well known that you cannot edit the common XML config files directly and expect the changes to persist. You must use the Ambari UI to update settings and then let Ambari update the config files.
Ambari provides a UI for the most commonly used settings. However, there are settings that are not exposed through the UI. Ambari has a documented API that allows administrators to interact with it. The API uses a REST framework, so the curl command can manipulate it directly. The Ambari API is documented on the Apache Ambari web site at https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md.
To make things simpler there is a shell script called configs.sh which allows get and set operations. This script, and the curl commands that work with it, are documented at https://cwiki.apache.org/confluence/display/AMBARI/Modify+configurations. These are the basic tools used to modify the service account settings.
Finding and accessing cluster settings
Ambari organizes the settings into groups of configurations. You can list all the configurations using this command:
curl -u admin:admin -X GET http://AMBARI_SERVER_HOST:8080/api/v1/clusters/CLUSTER_NAME?fields=Clusters/desired_configs
Sample output:
curl -u admin:admin -X GET http://localhost:8080/api/v1/clusters/csaaL1?fields=Clusters
Partial output:
"Clusters" : {
"cluster_name" : "csaaL1",
"version" : "HDP-2.2",
"desired_configs" : {
"ams-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-policy" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-security-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"capacity-scheduler" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"cluster-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"core-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hadoop-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hadoop-policy" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hcat-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hdfs-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hdfs-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hive-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hive-exec-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hive-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hive-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hiveserver2-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"mapred-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"mapred-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"pig-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"pig-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"pig-properties" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ranger-hdfs-plugin-properties" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ranger-hive-plugin-properties" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ssl-client" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ssl-server" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"tez-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"tez-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"webhcat-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"webhcat-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"webhcat-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"yarn-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"yarn-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"yarn-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"zoo.cfg" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"zookeeper-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"zookeeper-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
}
}
}
}
You can inspect each group of configs by performing a get operation using the configs.sh script. We can look at hdfs-site by using this command:
/var/lib/ambari-server/resources/scripts/configs.sh get localhost csaaL1 hdfs-site
This will produce:
########## Performing 'GET' on (Site:hdfs-site, Tag:version1)
"properties" : {
"dfs.block.access.token.enable" : "true",
"dfs.blockreport.initialDelay" : "120",
"dfs.blocksize" : "134217728",
"dfs.client.read.shortcircuit" : "true",
"dfs.client.read.shortcircuit.streams.cache.size" : "4096",
"dfs.client.retry.policy.enabled" : "false",
"dfs.cluster.administrators" : " hdfs",
"dfs.datanode.address" : "0.0.0.0:50010",
"dfs.datanode.balance.bandwidthPerSec" : "6250000",
"dfs.datanode.data.dir" : "/mnt/hadoop/hdfs/data",
"dfs.datanode.data.dir.perm" : "750",
"dfs.datanode.du.reserved" : "1073741824",
"dfs.datanode.failed.volumes.tolerated" : "0",
"dfs.datanode.http.address" : "0.0.0.0:50075",
"dfs.datanode.https.address" : "0.0.0.0:50475",
"dfs.datanode.ipc.address" : "0.0.0.0:8010",
"dfs.datanode.max.transfer.threads" : "4096",
"dfs.domain.socket.path" : "/var/lib/hadoop-hdfs/dn_socket",
"dfs.encryption.key.provider.uri" : "",
"dfs.heartbeat.interval" : "3",
"dfs.hosts.exclude" : "/etc/hadoop/conf/dfs.exclude",
"dfs.http.policy" : "HTTP_ONLY",
"dfs.https.port" : "50470",
"dfs.journalnode.edits.dir" : "/hadoop/hdfs/journalnode",
"dfs.journalnode.http-address" : "0.0.0.0:8480",
"dfs.journalnode.https-address" : "0.0.0.0:8481",
"dfs.namenode.accesstime.precision" : "0",
"dfs.namenode.audit.log.async" : "true",
"dfs.namenode.avoid.read.stale.datanode" : "true",
"dfs.namenode.avoid.write.stale.datanode" : "true",
"dfs.namenode.checkpoint.dir" : "/mnt/hadoop/hdfs/namesecondary",
"dfs.namenode.checkpoint.edits.dir" : "${dfs.namenode.checkpoint.dir}",
"dfs.namenode.checkpoint.period" : "21600",
"dfs.namenode.checkpoint.txns" : "1000000",
"dfs.namenode.fslock.fair" : "false",
"dfs.namenode.handler.count" : "50",
"dfs.namenode.http-address" : "nn1-ronlee-hdp232.cloud.hortonworks.com:50070",
"dfs.namenode.https-address" : "nn1-ronlee-hdp232.cloud.hortonworks.com:50470",
"dfs.namenode.name.dir" : "/mnt/hadoop/hdfs/namenode",
"dfs.namenode.name.dir.restore" : "true",
"dfs.namenode.rpc-address" : "nn1-ronlee-hdp232.cloud.hortonworks.com:8020",
"dfs.namenode.safemode.threshold-pct" : "1",
"dfs.namenode.secondary.http-address" : "nn2-ronlee-hdp232.cloud.hortonworks.com:50090",
"dfs.namenode.stale.datanode.interval" : "30000",
"dfs.namenode.startup.delay.block.deletion.sec" : "3600",
"dfs.namenode.write.stale.datanode.ratio" : "1.0f",
"dfs.permissions.enabled" : "true",
"dfs.permissions.superusergroup" : "hdfs",
"dfs.replication" : "3",
"dfs.replication.max" : "50",
"dfs.support.append" : "true",
"dfs.webhdfs.enabled" : "true",
"fs.permissions.umask-mode" : "022"
}
If you have done Ambari or HDP administration, these parameters will look familiar.
Back to the task at hand: how to find and change the service accounts. The service accounts are mostly in the configs that end in -env. Using the first command piped into grep, we can get the configuration names. Then we can read each config and grep for user.
Here we look for the -env configs:
curl -u admin:admin -X GET http://localhost:8080/api/v1/clusters/csaaL1?fields=Clusters/desired_configs | grep "-env"
And this produces:
"ams-env" : {
"ams-hbase-env" : {
"cluster-env" : {
"hadoop-env" : {
"hcat-env" : {
"hive-env" : {
"mapred-env" : {
"pig-env" : {
"tez-env" : {
"webhcat-env" : {
"yarn-env" : {
"zookeeper-env" : {
From these we can get the user names for Hive using:
/var/lib/ambari-server/resources/scripts/configs.sh get localhost csaaL1 hive-env | grep user
Which produces:
"hcat_user" : "hcat",
"hive_user" : "hive",
"hive_user_nofile_limit" : "32000",
"hive_user_nproc_limit" : "16000",
"webhcat_user" : "hcat"
We now have a process to find the settings for the service accounts.
Implementation
We can now use the above commands to find all the service accounts.
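Rather than running each get by hand, the two steps can be combined in one loop. This is a sketch under the same assumptions as the examples above (Ambari on localhost:8080, admin/admin credentials, cluster name csaaL1, configs.sh at its default path); the extract_env_names helper is a name introduced here for illustration.

```shell
#!/bin/sh
# extract_env_names: pull the *-env config group names out of the
# desired_configs JSON arriving on stdin, one name per line.
extract_env_names() {
    grep -o '"[a-z0-9-]*-env"' | tr -d '"'
}

# Assumptions: Ambari server on localhost:8080, credentials admin/admin,
# cluster name csaaL1, configs.sh at its default location.
CONFIGS=/var/lib/ambari-server/resources/scripts/configs.sh
CLUSTER=csaaL1

# Driving loop (needs a live Ambari server; does nothing otherwise):
curl -s -u admin:admin \
    "http://localhost:8080/api/v1/clusters/${CLUSTER}?fields=Clusters/desired_configs" |
    extract_env_names |
    while read -r env; do
        echo "== $env =="
        "$CONFIGS" get localhost "$CLUSTER" "$env" | grep user
    done
```

The output is the same list of user settings as the manual per-config gets, grouped by config name.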
Setting the service accounts
Once we have the existing names we can construct put operations to change the values. A separate put command is needed for each service account.
For example:
/var/lib/ambari-server/resources/scripts/configs.sh set <ambari_server_host> <cluster_name> hive-env "hive_user" "new_hive_service_account"
Details for the Tech_HDP cluster in the POC
We performed the following query:
curl -u admin:AmbAriTech -X GET http://localhost:8080/api/v1/clusters/tech_hdp?fields=Clusters/desired_configs | grep "-env"
This returned these results:
c030579@vmaksa69901bpv $ curl -u admin:AmbAriTech -X GET http://localhost:8080/api/v1/clusters/tech_hdp?fields=Clusters/desired_configs | grep "-env"
"ams-env" : {
"ams-hbase-env" : {
"cluster-env" : {
"falcon-env" : {
"flume-env" : {
"hadoop-env" : {
"hbase-env" : {
"hcat-env" : {
"hive-env" : {
"hst-env" : {
"kafka-env" : {
"kerberos-env" : {
"knox-env" : {
"mapred-env" : {
"oozie-env" : {
"pig-env" : {
"ranger-env" : {
"slider-env" : {
"spark-env" : {
"sqoop-env" : {
"storm-env" : {
"tez-env" : {
"webhcat-env" : {
"yarn-env" : {
"zookeeper-env" : {
c030579@vmaksa69901bpv $
Based on the results we were able to construct the following commands:
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ams-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ams-hbase-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp cluster-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp falcon-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp flume-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hadoop-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hbase-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hcat-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hst-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp kafka-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp kerberos-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp knox-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp mapred-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp oozie-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp pig-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ranger-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp slider-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp spark-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp sqoop-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp storm-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp tez-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp webhcat-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp yarn-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp zookeeper-env | grep user
Executing each command returned the service account variable names. These are the raw results.
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ams-env | grep user
"ambari_metrics_user" : "ams",
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ams-hbase-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp cluster-env | grep user
"ignore_groupsusers_create" : "true",
"smokeuser" : "ambari-qa",
"smokeuser_keytab" : "/etc/security/keytabs/smokeuser.headless.keytab",
"smokeuser_principal_name" : "ambari-qa@xxx.TEST-DNS.COM",
"user_group" : "hadoop"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp falcon-env | grep user
"falcon_user" : "falcon"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp flume-env | grep user
"flume_user" : "flume"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hadoop-env | grep user
"content" : "\n# Set Hadoop-specific environment variables here.\n\n# The only required environment variable is JAVA_HOME. All others are\n# optional. When running a distributed configuration it is best to\n# set JAVA_HOME in this file, so that it is correctly defined on\n# remote nodes.\n\n# The java implementation to use. Required.\nexport JAVA_HOME={{java_home}}\nexport HADOOP_HOME_WARN_SUPPRESS=1\n\n# Hadoop home directory\nexport HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}\n\n# Hadoop Configuration Directory\n\n{# this is different for HDP1 #}\n# Path to jsvc required by secure HDP 2.0 datanode\nexport JSVC_HOME={{jsvc_path}}\n\n\n# The maximum amount of heap to use, in MB. Default is 1000.\nexport HADOOP_HEAPSIZE=\"{{hadoop_heapsize}}\"\n\nexport HADOOP_NAMENODE_INIT_HEAPSIZE=\"-Xms{{namenode_heapsize}}\"\n\n# Extra Java runtime options. Empty by default.\nexport HADOOP_OPTS=\"-Djava.net.preferIPv4Stack=true ${HADOOP_OPTS}\"\n\n# Command specific options appended to HADOOP_OPTS when specified\nexport HADOOP_NAMENODE_OPTS=\"-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{namenode_opt_newsize}} -XX:MaxNewSize={{namenode_opt_maxnewsize}} -XX:PermSize={{namenode_opt_permsize}} -XX:MaxPermSize={{namenode_opt_maxpermsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{namenode_heapsize}} -Xmx{{namenode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_NAMENODE_OPTS}\"\nHADOOP_JOBTRACKER_OPTS=\"-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{jtnode_opt_newsize}} -XX:MaxNewSize={{jtnode_opt_maxnewsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps 
-Xmx{{jtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dmapred.audit.logger=INFO,MRAUDIT -Dhadoop.mapreduce.jobsummary.logger=INFO,JSA ${HADOOP_JOBTRACKER_OPTS}\"\n\nHADOOP_TASKTRACKER_OPTS=\"-server -Xmx{{ttnode_heapsize}} -Dhadoop.security.logger=ERROR,console -Dmapred.audit.logger=ERROR,console ${HADOOP_TASKTRACKER_OPTS}\"\nexport HADOOP_DATANODE_OPTS=\"-server -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -XX:PermSize=128m -XX:MaxPermSize=256m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{dtnode_heapsize}} -Xmx{{dtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_DATANODE_OPTS}\"\nHADOOP_BALANCER_OPTS=\"-server -Xmx{{hadoop_heapsize}}m ${HADOOP_BALANCER_OPTS}\"\n\nexport HADOOP_SECONDARYNAMENODE_OPTS=$HADOOP_NAMENODE_OPTS\n\n# The following applies to multiple commands (fs, dfs, fsck, distcp etc)\nexport HADOOP_CLIENT_OPTS=\"-Xmx${HADOOP_HEAPSIZE}m -XX:MaxPermSize=512m $HADOOP_CLIENT_OPTS\"\n\n# On secure datanodes, user to run the datanode as after dropping privileges\nexport HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER:-{{hadoop_secure_dn_user}}}\n\n# Extra ssh options. Empty by default.\nexport HADOOP_SSH_OPTS=\"-o ConnectTimeout=5 -o SendEnv=HADOOP_CONF_DIR\"\n\n# Where log files are stored. $HADOOP_HOME/logs by default.\nexport HADOOP_LOG_DIR={{hdfs_log_dir_prefix}}/$USER\n\n# History server logs\nexport HADOOP_MAPRED_LOG_DIR={{mapred_log_dir_prefix}}/$USER\n\n# Where log files are stored in the secure data environment.\nexport HADOOP_SECURE_DN_LOG_DIR={{hdfs_log_dir_prefix}}/$HADOOP_SECURE_DN_USER\n\n# File naming remote slave hosts. $HADOOP_HOME/conf/slaves by default.\n# export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves\n\n# host:path where hadoop code should be rsync'd from. 
Unset by default.\n# export HADOOP_MASTER=master:/home/$USER/src/hadoop\n\n# Seconds to sleep between slave commands. Unset by default. This\n# can be useful in large clusters, where, e.g., slave rsyncs can\n# otherwise arrive faster than the master can service them.\n# export HADOOP_SLAVE_SLEEP=0.1\n\n# The directory where pid files are stored. /tmp by default.\nexport HADOOP_PID_DIR={{hadoop_pid_dir_prefix}}/$USER\nexport HADOOP_SECURE_DN_PID_DIR={{hadoop_pid_dir_prefix}}/$HADOOP_SECURE_DN_USER\n\n# History server pid\nexport HADOOP_MAPRED_PID_DIR={{mapred_pid_dir_prefix}}/$USER\n\nYARN_RESOURCEMANAGER_OPTS=\"-Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY\"\n\n# A string representing this instance of hadoop. $USER by default.\nexport HADOOP_IDENT_STRING=$USER\n\n# The scheduling priority for daemon processes. See 'man nice'.\n\n# export HADOOP_NICENESS=10\n\n# Use libraries from standard classpath\nJAVA_JDBC_LIBS=\"\"\n#Add libraries required by mysql connector\nfor jarFile in `ls /usr/share/java/mysql-connector-java-5.1.17.jar /usr/share/java/mysql-connector-java.jar 2>/dev/null`\ndo\n JAVA_JDBC_LIBS=${JAVA_JDBC_LIBS}:$jarFile\ndone\n# Add libraries required by oracle connector\nfor jarFile in `ls /usr/share/java/*ojdbc* 2>/dev/null`\ndo\n JAVA_JDBC_LIBS=${JAVA_JDBC_LIBS}:$jarFile\ndone\n# Add libraries required by nodemanager\nMAPREDUCE_LIBS={{mapreduce_libs_path}}\nexport HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS}:${MAPREDUCE_LIBS}\n\n# added to the HADOOP_CLASSPATH\nif [ -d \"/usr/hdp/current/tez-client\" ]; then\n if [ -d \"/etc/tez/conf/\" ]; then\n # When using versioned RPMs, the tez-client will be a symlink to the current folder of tez in HDP.\n export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:/usr/hdp/current/tez-client/*:/usr/hdp/current/tez-client/lib/*:/etc/tez/conf/\n fi\nfi\n\n\n# Setting path to hdfs command line\nexport HADOOP_LIBEXEC_DIR={{hadoop_libexec_dir}}\n\n# Mostly required for hadoop 2.0\nexport 
JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}\n\nexport HADOOP_OPTS=\"-Dhdp.version=$HDP_VERSION $HADOOP_OPTS\"",
"hdfs_user" : "hdfs",
"hdfs_user_keytab" : "/etc/security/keytabs/hdfs.headless.keytab",
"proxyuser_group" : "users"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hbase-env | grep user
"hbase_user" : "hbase",
"hbase_user_keytab" : "/etc/security/keytabs/hbase.headless.keytab"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hcat-env | grep user
"content" : "\n # Licensed to the Apache Software Foundation (ASF) under one\n # or more contributor license agreements. See the NOTICE file\n # distributed with this work for additional information\n # regarding copyright ownership. The ASF licenses this file\n # to you under the Apache License, Version 2.0 (the\n # \"License\"); you may not use this file except in compliance\n # with the License. You may obtain a copy of the License at\n #\n # http://www.apache.org/licenses/LICENSE-2.0\n #\n # Unless required by applicable law or agreed to in writing, software\n # distributed under the License is distributed on an \"AS IS\" BASIS,\n # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n # See the License for the specific language governing permissions and\n # limitations under the License.\n\n JAVA_HOME={{java64_home}}\n HCAT_PID_DIR={{hcat_pid_dir}}/\n HCAT_LOG_DIR={{hcat_log_dir}}/\n HCAT_CONF_DIR={{hcat_conf_dir}}\n HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}\n #DBROOT is the path where the connector jars are downloaded\n DBROOT={{hcat_dbroot}}\n USER={{hcat_user}}\n METASTORE_PORT={{hive_metastore_port}}"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hst-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp kafka-env | grep user
"kafka_user" : "kafka"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp kerberos-env | grep user
"create_attributes_template" : "\n{\n \"objectClass\": [\"top\", \"person\", \"organizationalPerson\", \"user\"],\n \"cn\": \"$principal_name\",\n #if( $is_service )\n \"servicePrincipalName\": \"$principal_name\",\n #end\n \"userPrincipalName\": \"$normalized_principal\",\n \"unicodePwd\": \"$password\",\n \"accountExpires\": \"0\",\n \"userAccountControl\": \"66048\"\n}\n ",
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp knox-env | grep user
"knox_user" : "knox"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp mapred-env | grep user
"mapred_user" : "mapred"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp oozie-env | grep user
"oozie_user" : "oozie"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp pig-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ranger-env | grep user
"admin_username" : "admin",
"ranger_admin_username" : "amb_ranger_admin",
"ranger_user" : "ranger",
"ranger_usersync_log_dir" : "/var/log/hadoop/ranger/usersync"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp slider-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp spark-env | grep user
"spark_user" : "spark"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp sqoop-env | grep user
"sqoop_user" : "sqoop"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp storm-env | grep user
"storm_user" : "storm"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp tez-env | grep user
"tez_user" : "tez"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp webhcat-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp yarn-env | grep user
"min_user_id" : "500",
"yarn_user" : "yarn"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp zookeeper-env | grep user
"zk_user" : "zookeeper",
c030579@vmaksa69901bpv $
Using these results we can create the specific get/put commands. The template for the set command is:
/var/lib/ambari-server/resources/scripts/configs.sh set localhost tech_hdp <env name> "<key name>" "<new value>"
This table gives the configuration names, keys, and original values.
Service Account   Configuration    Setting Name          Original Value
ams               ams-env          ambari_metrics_user   ams
ambari-qa         cluster-env      smokeuser             ambari-qa
falcon            falcon-env       falcon_user           falcon
flume             flume-env        flume_user            flume
hdfs              hadoop-env       hdfs_user             hdfs
hbase             hbase-env        hbase_user            hbase
hcat              hive-env         hcat_user             hcat
hive              hive-env         hive_user             hive
hcat              hive-env         webhcat_user          hcat
kafka             kafka-env        kafka_user            kafka
knox              knox-env         knox_user             knox
mapred            mapred-env       mapred_user           mapred
oozie             oozie-env        oozie_user            oozie
ranger            ranger-env       ranger_user           ranger
spark             spark-env        spark_user            spark
sqoop             sqoop-env        sqoop_user            sqoop
storm             storm-env        storm_user            storm
tez               tez-env          tez_user              tez
yarn              yarn-env         yarn_user             yarn
zookeeper         zookeeper-env    zk_user               zookeeper
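The rows of the table can be turned into set calls with a small wrapper. This is a sketch only: set_account and target_name are hypothetical helpers introduced here, and the "svc_" prefix illustrates one possible naming convention; substitute your real target account names.

```shell
#!/bin/sh
# Batch-set service accounts from the table above.
CONFIGS=/var/lib/ambari-server/resources/scripts/configs.sh
CLUSTER=tech_hdp

# set_account: wrap the configs.sh set template for one table row.
# Arguments: config group, setting name, new account name.
set_account() {
    "$CONFIGS" -u admin -p AmbAriTech set localhost "$CLUSTER" "$1" "$2" "$3"
}

# target_name: derive a new account name; the "svc_" prefix is illustrative.
target_name() {
    echo "svc_$1"
}

# Example rows from the table (commented out so nothing changes by accident):
# set_account hadoop-env hdfs_user "$(target_name hdfs)"
# set_account hive-env   hive_user "$(target_name hive)"
# set_account yarn-env   yarn_user "$(target_name yarn)"
```

One set_account call is needed per row of the table, matching the one-put-per-account rule above.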
Sample setting
Here is a sample set and restore of a value. This shows how Ambari responds to the REST API calls.
We first query the environment to get the key name and original value.
[root@mon-ronlee-hdp228 ~]# /var/lib/ambari-server/resources/scripts/configs.sh get localhost test ams-env | grep user
"ambari_metrics_user" : "ams",
[root@mon-ronlee-hdp228 ~]# /var/lib/ambari-server/resources/scripts/configs.sh set localhost test ams-env "ambari_metrics_user" "ronlee"
########## Performing 'set' ambari_metrics_user:ronlee on (Site:ams-env, Tag:version1)
########## Config found. Skipping origin value
########## PUTting json into: doSet_version1482208959734754638.json
{
"resources" : [
{
"href" : "http://localhost:8080/api/v1/clusters/test/configurations/service_config_versions?service_name=AMBARI_METRICS&service_config_version=2",
"configurations" : [
{
"clusterName" : "test",
"type" : "ams-env",
"versionTag" : "version1482208959734754638",
"version" : 2,
"serviceConfigVersions" : null,
"configs" : {
"content" : "\n# Set environment variables here.\n\n# The java implementation to use. Java 1.6 required.\nexport JAVA_HOME={{java64_home}}\n\n# Collector Log directory for log4j\nexport AMS_COLLECTOR_LOG_DIR={{ams_collector_log_dir}}\n\n# Monitor Log directory for outfile\nexport AMS_MONITOR_LOG_DIR={{ams_monitor_log_dir}}\n\n# Collector pid directory\nexport AMS_COLLECTOR_PID_DIR={{ams_collector_pid_dir}}\n\n# Monitor pid directory\nexport AMS_MONITOR_PID_DIR={{ams_monitor_pid_dir}}\n\n# AMS HBase pid directory\nexport AMS_HBASE_PID_DIR={{hbase_pid_dir}}\n\n# AMS Collector heapsize\nexport AMS_COLLECTOR_HEAPSIZE={{metrics_collector_heapsize}}\n\n# AMS Collector options\nexport AMS_COLLECTOR_OPTS=\"-Djava.library.path=/usr/lib/ams-hbase/lib/hadoop-native -Xmx$AMS_COLLECTOR_HEAPSIZE \"\n{% if security_enabled %}\nexport AMS_COLLECTOR_OPTS=\"$AMS_COLLECTOR_OPTS -Djava.security.auth.login.config={{ams_collector_jaas_config_file}}\"\n{% endif %}",
"ambari_metrics_user" : "ronlee",
"metrics_monitor_log_dir" : "/var/log/ambari-metrics-monitor",
"metrics_collector_heapsize" : "512m",
"metrics_collector_log_dir" : "/var/log/ambari-metrics-collector",
"metrics_monitor_pid_dir" : "/var/run/ambari-metrics-monitor",
"metrics_collector_pid_dir" : "/var/run/ambari-metrics-collector"
},
"configAttributes" : { }
}
],
"group_id" : null,
"group_name" : null,
"service_config_version" : 2,
"service_config_version_note" : null,
"service_name" : "AMBARI_METRICS"
}
]
}########## NEW Site:ams-env, Tag:version1482208959734754638
We can verify we changed the value by querying the environment again.
[root@mon-ronlee-hdp228 ~]# /var/lib/ambari-server/resources/scripts/configs.sh get localhost test ams-env | grep user
"ambari_metrics_user" : "ronlee",
We can then restore the value.
[root@mon-ronlee-hdp228 ~]# /var/lib/ambari-server/resources/scripts/configs.sh set localhost test ams-env "ambari_metrics_user" "ams"
########## Performing 'set' ambari_metrics_user:ams on (Site:ams-env, Tag:version1482208959734754638)
########## Config found. Skipping origin value
########## PUTting json into: doSet_version1482209050202685766.json
{
"resources" : [
{
"href" : "http://localhost:8080/api/v1/clusters/test/configurations/service_config_versions?service_name=AMBARI_METRICS&service_config_version=3",
"configurations" : [
{
"clusterName" : "test",
"type" : "ams-env",
"versionTag" : "version1482209050202685766",
"version" : 3,
"serviceConfigVersions" : null,
"configs" : {
"content" : "\n# Set environment variables here.\n\n# The java implementation to use. Java 1.6 required.\nexport JAVA_HOME={{java64_home}}\n\n# Collector Log directory for log4j\nexport AMS_COLLECTOR_LOG_DIR={{ams_collector_log_dir}}\n\n# Monitor Log directory for outfile\nexport AMS_MONITOR_LOG_DIR={{ams_monitor_log_dir}}\n\n# Collector pid directory\nexport AMS_COLLECTOR_PID_DIR={{ams_collector_pid_dir}}\n\n# Monitor pid directory\nexport AMS_MONITOR_PID_DIR={{ams_monitor_pid_dir}}\n\n# AMS HBase pid directory\nexport AMS_HBASE_PID_DIR={{hbase_pid_dir}}\n\n# AMS Collector heapsize\nexport AMS_COLLECTOR_HEAPSIZE={{metrics_collector_heapsize}}\n\n# AMS Collector options\nexport AMS_COLLECTOR_OPTS=\"-Djava.library.path=/usr/lib/ams-hbase/lib/hadoop-native -Xmx$AMS_COLLECTOR_HEAPSIZE \"\n{% if security_enabled %}\nexport AMS_COLLECTOR_OPTS=\"$AMS_COLLECTOR_OPTS -Djava.security.auth.login.config={{ams_collector_jaas_config_file}}\"\n{% endif %}",
"ambari_metrics_user" : "ams",
"metrics_monitor_log_dir" : "/var/log/ambari-metrics-monitor",
"metrics_collector_heapsize" : "512m",
"metrics_collector_log_dir" : "/var/log/ambari-metrics-collector",
"metrics_monitor_pid_dir" : "/var/run/ambari-metrics-monitor",
"metrics_collector_pid_dir" : "/var/run/ambari-metrics-collector"
},
"configAttributes" : { }
}
],
"group_id" : null,
"group_name" : null,
"service_config_version" : 3,
"service_config_version_note" : null,
"service_name" : "AMBARI_METRICS"
}
]
}########## NEW Site:ams-env, Tag:version1482209050202685766
[root@mon-ronlee-hdp228 ~]#
References
Apache Ambari web site – Update Service Accounts After Install
https://cwiki.apache.org/confluence/display/AMBARI/Update+Service-Accounts+After+Install
HCC – How to rename service account users in Ambari?
https://community.hortonworks.com/articles/49449/how-to-rename-service-account-users-in-ambari.html
HDP docs – Set Up Service User Accounts (general discussion)
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Installing_HDP_AMB/content/_set_up_service_user_accounts.html
HDP docs – Defining Service Users and Groups for an HDP 2.x Stack (details)
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_ambari_reference_guide/content/_defining_service_users_and_groups_for_a_hdp_2x_stack.html
11-24-2016
04:36 AM
Thank you. Yes it is ancient. I am looking at justifying an upgrade.
11-21-2016
06:10 PM
@R c, does this answer your question? Thanks, Ron
11-21-2016
05:53 PM
I need to run a Views-only server in Ambari 2.0.1. Does this work? It is not mentioned in the docs, but I would expect it to work. Can someone confirm this?
Labels:
- Apache Ambari
11-10-2016
06:25 PM
Hi! You can overcommit RAM in ESXi, so this will work. Each VM will use what it needs up to its configured maximum. However, since HDP tends to run at full performance, you will run out of RAM at some point. From a hardware point of view your server needs more RAM; most ESXi servers with dual 12-core CPUs need 128 GB of RAM to fully utilize the CPU resource.
10-12-2016
05:07 AM
Is there a good example of an Ambari Custom metric I can use as a template to write custom metrics for an app? Anyone interested in writing a tutorial?
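One possible starting point, not an authoritative template: the AMS Collector accepts custom metrics as JSON POSTed to its timeline endpoint. This sketch only builds and prints the payload; the collector hostname, port 6188, metric name, and appid are assumptions — check ams-site in your cluster for the real values.

```shell
#!/bin/sh
# Sketch: build a custom-metric JSON payload for the AMS Collector's
# timeline endpoint (POST http://<metrics-collector-host>:6188/ws/v1/timeline/metrics).
# Metric name "myapp.requests_per_sec" and appid "myapp" are hypothetical.
NOW=$(($(date +%s) * 1000))   # AMS expects millisecond timestamps
PAYLOAD=$(cat <<EOF
{"metrics":[{"metricname":"myapp.requests_per_sec","appid":"myapp","hostname":"$(hostname)","starttime":$NOW,"metrics":{"$NOW":42.0}}]}
EOF
)
echo "$PAYLOAD"
# To actually send it:
# curl -s -H "Content-Type: application/json" -X POST \
#   -d "$PAYLOAD" "http://metrics-collector:6188/ws/v1/timeline/metrics"
```

Once posted, the metric should be queryable from the same collector by metricname and appid.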
Labels:
- Apache Ambari
07-26-2016
03:08 AM
I will be configuring an HDP cluster with a standalone KDC. I see info on how to set up multiple KDCs for the HDP components. How should I set up the multiple KDCs? Can I create a master KDC with an HA pair? Has anyone deployed something like this in the real world?
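For context, MIT Kerberos lets clients fail over between KDCs simply by listing several kdc entries per realm in krb5.conf; writes (kadmin, password changes) still go to the admin_server on the master. A sketch of the client-side fragment, with hypothetical hostnames:

```
[realms]
  EXAMPLE.COM = {
    kdc = kdc1.example.com          # master KDC
    kdc = kdc2.example.com          # replica, kept in sync via kprop/kpropd
    admin_server = kdc1.example.com # kadmind runs only on the master
  }
```

The replica serves tickets read-only; the database is pushed to it from the master, so losing the master degrades only administrative operations, not authentication.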
Labels:
- Hortonworks Data Platform (HDP)
06-20-2016
12:50 AM
The KDC should be on a separate machine because you will eventually have to turn it over to computer security since it is a source of authority for the principals. They should not let the HDP admins authorize their own accounts.