09-19-2019
12:20 PM
1 Kudo
This article from Stack Overflow may be helpful: https://stackoverflow.com/questions/17355270/recovering-black-listed-tasktrackers-in-mapreduce-job. In general, a blacklisted node can be brought back online by cleaning up the errors. You will need to look at the TaskTracker logs to see what was causing the failures. Let me know what happens. Ron
09-27-2017
03:48 AM
1 Kudo
The preferred configuration for integrating with Active Directory is to use a standalone KDC and create a cross-realm trust. I have done several of these deployments on physical hardware. Recently I built a test system on our OpenStack lab cluster using a small instance for the KDC. I followed the instructions in the HDP Security guide for configuring a KDC. When I created the database I noticed that the kdb5_util create -s command was stalling out. I tried several fixes and it took way too long.
I did some searching on Kerberos and learned how the Kerberos utilities create the random data needed for encryption. The designers of Kerberos wanted a truly random data generator, so they based it on OS activity. The kernel exposes the available entropy at /proc/sys/kernel/random/entropy_avail; you can cat this value to see how much entropy your system has available. Since a VM is mostly idle, you will get a small value.
RedHat provides a package called rng-tools that you can install with yum:
sudo yum install rng-tools
Then enable and start rngd:
sudo chkconfig rngd on
sudo service rngd start
Cat /proc/sys/kernel/random/entropy_avail again to confirm that the entropy in your VM has increased. With a much higher value, kdb5_util create -s completes in a few seconds. Reference documentation from RedHat: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security_Guide/sect-Security_Guide-Encryption-Using_the_Random_Number_Generator.html
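The entropy check can be wrapped in a small script to run before and after starting rngd. This is only a sketch: it assumes a Linux /proc interface, and the 1000-bit threshold is an illustrative value I chose, not an official recommendation.

```shell
#!/bin/sh
# classify_entropy: decide whether an entropy_avail reading looks worryingly low.
# The 1000-bit threshold below is an illustrative assumption.
classify_entropy() {
    if [ "$1" -lt 1000 ]; then
        echo "low"
    else
        echo "ok"
    fi
}

# Read the live kernel counter when available. Idle VMs without rngd
# often report only a few hundred bits here.
ENTROPY_FILE=/proc/sys/kernel/random/entropy_avail
if [ -r "$ENTROPY_FILE" ]; then
    avail=$(cat "$ENTROPY_FILE")
    echo "entropy_avail=$avail ($(classify_entropy "$avail"))"
fi
```

Running it once before installing rng-tools and once after starting rngd makes it easy to confirm the entropy pool actually grew.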
03-25-2017
06:03 AM
3 Kudos
Overview
This document provides instructions and background information on how to change the service accounts in an HDP cluster that is managed by Ambari.
Service accounts in Hortonworks Data Platform
The Hortonworks Data Platform uses service accounts to provide a common user to perform cluster operations and to hold ownership of files and folders for a given service.
All major services are assigned a service account. The Ambari web interface provides a list of the service accounts used by a specific cluster. The table below lists common service accounts.
Hadoop Service         User        Group
HDFS                   hdfs        hadoop
YARN                   yarn        hadoop
MapReduce              mapred      hadoop, mapred
Hive                   hive        hadoop
HCatalog/WebHCatalog   hcat        hadoop
HBase                  hbase       hadoop
Falcon                 falcon      hadoop
Sqoop                  sqoop       hadoop
ZooKeeper              zookeeper   hadoop
Oozie                  oozie       hadoop
Knox Gateway           knox        hadoop
In the course of a standard Ambari installation the service accounts are created on the local Linux systems and assigned the appropriate privileges and ownerships. Many enterprises manage service accounts through directory services such as Microsoft Active Directory. To use such accounts, they must be created before the cluster installation and the admin must override the default service accounts.
For US Bank the clusters have already been deployed, so we need to retroactively change the service accounts. This document describes the process for doing so.
Ambari management of service accounts and other cluster metadata
For the Hortonworks Data Platform, Ambari manages all cluster services and cluster metadata. Cluster metadata consists of parameter settings for services, access strings, and other settings that determine how the cluster services work.
Ambari uses a database to store all cluster metadata. The default database is Postgres, but any SQL database such as MySQL or Oracle can be used.
Ambari policy defines the cluster metadata in the Ambari database as the master values. Ambari will overwrite cluster component files with the values in its database, typically during service restarts. To change the service account names we therefore have to modify the Ambari database.
Ambari provides a REST API to allow us to make these changes.
Details on the Ambari REST API can be found at the Apache Ambari project web page at https://ambari.apache.org/.
Service accounts interaction with Isilon OneFS
For Hortonworks Data Platform 2.2.8, ownership and privileges between Hortonworks Data Platform and EMC Isilon OneFS have to be managed manually by the admins. Detailing how to do this is outside the scope of this document; detailed information can be found in the EMC Isilon OneFS docs.
Managing settings in Ambari
Ambari uses a database that is created during installation to record and manage all cluster settings. Ambari uses this database to update and refresh cluster settings when services are restarted. As such, it is well known that you cannot edit the common XML config files directly and expect the changes to persist. You must use the Ambari UI to update settings and then let Ambari update the config files.
Ambari provides a UI for the most commonly used settings. However, there are settings that are not exposed through the UI. Ambari has a documented API that allows administrators to interact with it. The API uses a REST framework, so the curl command can manipulate it directly. The Ambari API is documented on the Apache Ambari web site at https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md.
To make things simpler there is a shell script called configs.sh which allows get and set operations. This script, and the curl commands that work with it, are documented at https://cwiki.apache.org/confluence/display/AMBARI/Modify+configurations. These are the basic tools used to modify the service account settings.
Finding and accessing cluster settings
Ambari organizes the settings into groups of configurations. You can list all the configurations using this command:
curl -u admin:admin -X GET http://AMBARI_SERVER_HOST:8080/api/v1/clusters/CLUSTER_NAME?fields=Clusters/desired_configs
Sample output:
curl -u admin:admin -X GET http://localhost:8080/api/v1/clusters/csaaL1?fields=Clusters
Partial output:
"Clusters" : {
"cluster_name" : "csaaL1",
"version" : "HDP-2.2",
"desired_configs" : {
"ams-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-policy" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-security-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-hbase-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ams-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"capacity-scheduler" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"cluster-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"core-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hadoop-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hadoop-policy" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hcat-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hdfs-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hdfs-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hive-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hive-exec-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hive-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hive-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"hiveserver2-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"mapred-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"mapred-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"pig-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"pig-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"pig-properties" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ranger-hdfs-plugin-properties" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ranger-hive-plugin-properties" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ssl-client" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"ssl-server" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"tez-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"tez-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"webhcat-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"webhcat-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"webhcat-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"yarn-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"yarn-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"yarn-site" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"zoo.cfg" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"zookeeper-env" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
},
"zookeeper-log4j" : {
"tag" : "version1",
"user" : "admin",
"version" : 1
}
}
}
}
You can inspect each group of configs by performing a get operation using the configs.sh script. We can look at hdfs-site by using this command:
/var/lib/ambari-server/resources/scripts/configs.sh get localhost csaaL1 hdfs-site
This will produce:
########## Performing 'GET' on (Site:hdfs-site, Tag:version1)
"properties" : {
"dfs.block.access.token.enable" : "true",
"dfs.blockreport.initialDelay" : "120",
"dfs.blocksize" : "134217728",
"dfs.client.read.shortcircuit" : "true",
"dfs.client.read.shortcircuit.streams.cache.size" : "4096",
"dfs.client.retry.policy.enabled" : "false",
"dfs.cluster.administrators" : " hdfs",
"dfs.datanode.address" : "0.0.0.0:50010",
"dfs.datanode.balance.bandwidthPerSec" : "6250000",
"dfs.datanode.data.dir" : "/mnt/hadoop/hdfs/data",
"dfs.datanode.data.dir.perm" : "750",
"dfs.datanode.du.reserved" : "1073741824",
"dfs.datanode.failed.volumes.tolerated" : "0",
"dfs.datanode.http.address" : "0.0.0.0:50075",
"dfs.datanode.https.address" : "0.0.0.0:50475",
"dfs.datanode.ipc.address" : "0.0.0.0:8010",
"dfs.datanode.max.transfer.threads" : "4096",
"dfs.domain.socket.path" : "/var/lib/hadoop-hdfs/dn_socket",
"dfs.encryption.key.provider.uri" : "",
"dfs.heartbeat.interval" : "3",
"dfs.hosts.exclude" : "/etc/hadoop/conf/dfs.exclude",
"dfs.http.policy" : "HTTP_ONLY",
"dfs.https.port" : "50470",
"dfs.journalnode.edits.dir" : "/hadoop/hdfs/journalnode",
"dfs.journalnode.http-address" : "0.0.0.0:8480",
"dfs.journalnode.https-address" : "0.0.0.0:8481",
"dfs.namenode.accesstime.precision" : "0",
"dfs.namenode.audit.log.async" : "true",
"dfs.namenode.avoid.read.stale.datanode" : "true",
"dfs.namenode.avoid.write.stale.datanode" : "true",
"dfs.namenode.checkpoint.dir" : "/mnt/hadoop/hdfs/namesecondary",
"dfs.namenode.checkpoint.edits.dir" : "${dfs.namenode.checkpoint.dir}",
"dfs.namenode.checkpoint.period" : "21600",
"dfs.namenode.checkpoint.txns" : "1000000",
"dfs.namenode.fslock.fair" : "false",
"dfs.namenode.handler.count" : "50",
"dfs.namenode.http-address" : "nn1-ronlee-hdp232.cloud.hortonworks.com:50070",
"dfs.namenode.https-address" : "nn1-ronlee-hdp232.cloud.hortonworks.com:50470",
"dfs.namenode.name.dir" : "/mnt/hadoop/hdfs/namenode",
"dfs.namenode.name.dir.restore" : "true",
"dfs.namenode.rpc-address" : "nn1-ronlee-hdp232.cloud.hortonworks.com:8020",
"dfs.namenode.safemode.threshold-pct" : "1",
"dfs.namenode.secondary.http-address" : "nn2-ronlee-hdp232.cloud.hortonworks.com:50090",
"dfs.namenode.stale.datanode.interval" : "30000",
"dfs.namenode.startup.delay.block.deletion.sec" : "3600",
"dfs.namenode.write.stale.datanode.ratio" : "1.0f",
"dfs.permissions.enabled" : "true",
"dfs.permissions.superusergroup" : "hdfs",
"dfs.replication" : "3",
"dfs.replication.max" : "50",
"dfs.support.append" : "true",
"dfs.webhdfs.enabled" : "true",
"fs.permissions.umask-mode" : "022"
}
If you have done Ambari or HDP administration, these parameters will look familiar.
Back to the task at hand: how to find and change the service accounts. The service accounts are mostly in the configs that end in -env. Using the first command piped into grep, we can get the configuration names. Then we can read each config and grep for user.
Here we look for the -env configs:
curl -u admin:admin -X GET http://localhost:8080/api/v1/clusters/csaaL1?fields=Clusters/desired_configs | grep "-env"
And this produces:
"ams-env" : {
"ams-hbase-env" : {
"cluster-env" : {
"hadoop-env" : {
"hcat-env" : {
"hive-env" : {
"mapred-env" : {
"pig-env" : {
"tez-env" : {
"webhcat-env" : {
"yarn-env" : {
"zookeeper-env" : {
From these we can get the user names for Hive using:
/var/lib/ambari-server/resources/scripts/configs.sh get localhost csaaL1 hive-env | grep user
Which produces:
"hcat_user" : "hcat",
"hive_user" : "hive",
"hive_user_nofile_limit" : "32000",
"hive_user_nproc_limit" : "16000",
"webhcat_user" : "hcat"
We now have a process to find the settings for the service accounts.
Implementation
We can now use the above commands to find all the service accounts.
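Rather than running each get by hand, the two steps can be combined in one loop. This is a sketch under the same assumptions as the examples above (Ambari on localhost:8080, admin/admin credentials, cluster name csaaL1, configs.sh at its default path); the extract_env_names helper is a name introduced here for illustration.

```shell
#!/bin/sh
# extract_env_names: pull the *-env config group names out of the
# desired_configs JSON arriving on stdin, one name per line.
extract_env_names() {
    grep -o '"[a-z0-9-]*-env"' | tr -d '"'
}

# Assumptions: Ambari server on localhost:8080, credentials admin/admin,
# cluster name csaaL1, configs.sh at its default location.
CONFIGS=/var/lib/ambari-server/resources/scripts/configs.sh
CLUSTER=csaaL1

# Driving loop (needs a live Ambari server; does nothing otherwise):
curl -s -u admin:admin \
    "http://localhost:8080/api/v1/clusters/${CLUSTER}?fields=Clusters/desired_configs" |
    extract_env_names |
    while read -r env; do
        echo "== $env =="
        "$CONFIGS" get localhost "$CLUSTER" "$env" | grep user
    done
```

The output is the same list of user settings as the manual per-config gets, grouped by config name.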
Setting the service accounts
Once we have the existing names we can construct put operations to change the values. A separate put command is needed for each service account.
For example:
/var/lib/ambari-server/resources/scripts/configs.sh set <ambari_server_host> <cluster_name> hive-env "hive_user" "new_hive_service_account"
Details for the Tech_HDP cluster in the POC
We performed the following query:
curl -u admin:AmbAriTech -X GET http://localhost:8080/api/v1/clusters/tech_hdp?fields=Clusters/desired_configs | grep "-env"
This returned these results:
c030579@vmaksa69901bpv $ curl -u admin:AmbAriTech -X GET http://localhost:8080/api/v1/clusters/tech_hdp?fields=Clusters/desired_configs | grep "-env"
"ams-env" : {
"ams-hbase-env" : {
"cluster-env" : {
"falcon-env" : {
"flume-env" : {
"hadoop-env" : {
"hbase-env" : {
"hcat-env" : {
"hive-env" : {
"hst-env" : {
"kafka-env" : {
"kerberos-env" : {
"knox-env" : {
"mapred-env" : {
"oozie-env" : {
"pig-env" : {
"ranger-env" : {
"slider-env" : {
"spark-env" : {
"sqoop-env" : {
"storm-env" : {
"tez-env" : {
"webhcat-env" : {
"yarn-env" : {
"zookeeper-env" : {
c030579@vmaksa69901bpv $
Based on the results we were able to construct the following commands:
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ams-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ams-hbase-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp cluster-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp falcon-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp flume-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hadoop-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hbase-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hcat-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hst-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp kafka-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp kerberos-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp knox-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp mapred-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp oozie-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp pig-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ranger-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp slider-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp spark-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp sqoop-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp storm-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp tez-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp webhcat-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp yarn-env | grep user
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp zookeeper-env | grep user
Executing each command returned the service account variable names. These are the raw results.
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ams-env | grep user
"ambari_metrics_user" : "ams",
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ams-hbase-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp cluster-env | grep user
"ignore_groupsusers_create" : "true",
"smokeuser" : "ambari-qa",
"smokeuser_keytab" : "/etc/security/keytabs/smokeuser.headless.keytab",
"smokeuser_principal_name" : "ambari-qa@xxx.TEST-DNS.COM",
"user_group" : "hadoop"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp falcon-env | grep user
"falcon_user" : "falcon"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp flume-env | grep user
"flume_user" : "flume"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hadoop-env | grep user
"content" : "\n# Set Hadoop-specific environment variables here.\n\n# The only required environment variable is JAVA_HOME. All others are\n# optional. When running a distributed configuration it is best to\n# set JAVA_HOME in this file, so that it is correctly defined on\n# remote nodes.\n\n# The java implementation to use. Required.\nexport JAVA_HOME={{java_home}}\nexport HADOOP_HOME_WARN_SUPPRESS=1\n\n# Hadoop home directory\nexport HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}\n\n# Hadoop Configuration Directory\n\n{# this is different for HDP1 #}\n# Path to jsvc required by secure HDP 2.0 datanode\nexport JSVC_HOME={{jsvc_path}}\n\n\n# The maximum amount of heap to use, in MB. Default is 1000.\nexport HADOOP_HEAPSIZE=\"{{hadoop_heapsize}}\"\n\nexport HADOOP_NAMENODE_INIT_HEAPSIZE=\"-Xms{{namenode_heapsize}}\"\n\n# Extra Java runtime options. Empty by default.\nexport HADOOP_OPTS=\"-Djava.net.preferIPv4Stack=true ${HADOOP_OPTS}\"\n\n# Command specific options appended to HADOOP_OPTS when specified\nexport HADOOP_NAMENODE_OPTS=\"-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{namenode_opt_newsize}} -XX:MaxNewSize={{namenode_opt_maxnewsize}} -XX:PermSize={{namenode_opt_permsize}} -XX:MaxPermSize={{namenode_opt_maxpermsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{namenode_heapsize}} -Xmx{{namenode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_NAMENODE_OPTS}\"\nHADOOP_JOBTRACKER_OPTS=\"-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{jtnode_opt_newsize}} -XX:MaxNewSize={{jtnode_opt_maxnewsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps 
-Xmx{{jtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dmapred.audit.logger=INFO,MRAUDIT -Dhadoop.mapreduce.jobsummary.logger=INFO,JSA ${HADOOP_JOBTRACKER_OPTS}\"\n\nHADOOP_TASKTRACKER_OPTS=\"-server -Xmx{{ttnode_heapsize}} -Dhadoop.security.logger=ERROR,console -Dmapred.audit.logger=ERROR,console ${HADOOP_TASKTRACKER_OPTS}\"\nexport HADOOP_DATANODE_OPTS=\"-server -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -XX:PermSize=128m -XX:MaxPermSize=256m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{dtnode_heapsize}} -Xmx{{dtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_DATANODE_OPTS}\"\nHADOOP_BALANCER_OPTS=\"-server -Xmx{{hadoop_heapsize}}m ${HADOOP_BALANCER_OPTS}\"\n\nexport HADOOP_SECONDARYNAMENODE_OPTS=$HADOOP_NAMENODE_OPTS\n\n# The following applies to multiple commands (fs, dfs, fsck, distcp etc)\nexport HADOOP_CLIENT_OPTS=\"-Xmx${HADOOP_HEAPSIZE}m -XX:MaxPermSize=512m $HADOOP_CLIENT_OPTS\"\n\n# On secure datanodes, user to run the datanode as after dropping privileges\nexport HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER:-{{hadoop_secure_dn_user}}}\n\n# Extra ssh options. Empty by default.\nexport HADOOP_SSH_OPTS=\"-o ConnectTimeout=5 -o SendEnv=HADOOP_CONF_DIR\"\n\n# Where log files are stored. $HADOOP_HOME/logs by default.\nexport HADOOP_LOG_DIR={{hdfs_log_dir_prefix}}/$USER\n\n# History server logs\nexport HADOOP_MAPRED_LOG_DIR={{mapred_log_dir_prefix}}/$USER\n\n# Where log files are stored in the secure data environment.\nexport HADOOP_SECURE_DN_LOG_DIR={{hdfs_log_dir_prefix}}/$HADOOP_SECURE_DN_USER\n\n# File naming remote slave hosts. $HADOOP_HOME/conf/slaves by default.\n# export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves\n\n# host:path where hadoop code should be rsync'd from. 
Unset by default.\n# export HADOOP_MASTER=master:/home/$USER/src/hadoop\n\n# Seconds to sleep between slave commands. Unset by default. This\n# can be useful in large clusters, where, e.g., slave rsyncs can\n# otherwise arrive faster than the master can service them.\n# export HADOOP_SLAVE_SLEEP=0.1\n\n# The directory where pid files are stored. /tmp by default.\nexport HADOOP_PID_DIR={{hadoop_pid_dir_prefix}}/$USER\nexport HADOOP_SECURE_DN_PID_DIR={{hadoop_pid_dir_prefix}}/$HADOOP_SECURE_DN_USER\n\n# History server pid\nexport HADOOP_MAPRED_PID_DIR={{mapred_pid_dir_prefix}}/$USER\n\nYARN_RESOURCEMANAGER_OPTS=\"-Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY\"\n\n# A string representing this instance of hadoop. $USER by default.\nexport HADOOP_IDENT_STRING=$USER\n\n# The scheduling priority for daemon processes. See 'man nice'.\n\n# export HADOOP_NICENESS=10\n\n# Use libraries from standard classpath\nJAVA_JDBC_LIBS=\"\"\n#Add libraries required by mysql connector\nfor jarFile in `ls /usr/share/java/mysql-connector-java-5.1.17.jar /usr/share/java/mysql-connector-java.jar 2>/dev/null`\ndo\n JAVA_JDBC_LIBS=${JAVA_JDBC_LIBS}:$jarFile\ndone\n# Add libraries required by oracle connector\nfor jarFile in `ls /usr/share/java/*ojdbc* 2>/dev/null`\ndo\n JAVA_JDBC_LIBS=${JAVA_JDBC_LIBS}:$jarFile\ndone\n# Add libraries required by nodemanager\nMAPREDUCE_LIBS={{mapreduce_libs_path}}\nexport HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS}:${MAPREDUCE_LIBS}\n\n# added to the HADOOP_CLASSPATH\nif [ -d \"/usr/hdp/current/tez-client\" ]; then\n if [ -d \"/etc/tez/conf/\" ]; then\n # When using versioned RPMs, the tez-client will be a symlink to the current folder of tez in HDP.\n export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:/usr/hdp/current/tez-client/*:/usr/hdp/current/tez-client/lib/*:/etc/tez/conf/\n fi\nfi\n\n\n# Setting path to hdfs command line\nexport HADOOP_LIBEXEC_DIR={{hadoop_libexec_dir}}\n\n# Mostly required for hadoop 2.0\nexport 
JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}\n\nexport HADOOP_OPTS=\"-Dhdp.version=$HDP_VERSION $HADOOP_OPTS\"",
"hdfs_user" : "hdfs",
"hdfs_user_keytab" : "/etc/security/keytabs/hdfs.headless.keytab",
"proxyuser_group" : "users"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hbase-env | grep user
"hbase_user" : "hbase",
"hbase_user_keytab" : "/etc/security/keytabs/hbase.headless.keytab"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hcat-env | grep user
"content" : "\n # Licensed to the Apache Software Foundation (ASF) under one\n # or more contributor license agreements. See the NOTICE file\n # distributed with this work for additional information\n # regarding copyright ownership. The ASF licenses this file\n # to you under the Apache License, Version 2.0 (the\n # \"License\"); you may not use this file except in compliance\n # with the License. You may obtain a copy of the License at\n #\n # http://www.apache.org/licenses/LICENSE-2.0\n #\n # Unless required by applicable law or agreed to in writing, software\n # distributed under the License is distributed on an \"AS IS\" BASIS,\n # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n # See the License for the specific language governing permissions and\n # limitations under the License.\n\n JAVA_HOME={{java64_home}}\n HCAT_PID_DIR={{hcat_pid_dir}}/\n HCAT_LOG_DIR={{hcat_log_dir}}/\n HCAT_CONF_DIR={{hcat_conf_dir}}\n HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}\n #DBROOT is the path where the connector jars are downloaded\n DBROOT={{hcat_dbroot}}\n USER={{hcat_user}}\n METASTORE_PORT={{hive_metastore_port}}"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp hst-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp kafka-env | grep user
"kafka_user" : "kafka"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp kerberos-env | grep user
"create_attributes_template" : "\n{\n \"objectClass\": [\"top\", \"person\", \"organizationalPerson\", \"user\"],\n \"cn\": \"$principal_name\",\n #if( $is_service )\n \"servicePrincipalName\": \"$principal_name\",\n #end\n \"userPrincipalName\": \"$normalized_principal\",\n \"unicodePwd\": \"$password\",\n \"accountExpires\": \"0\",\n \"userAccountControl\": \"66048\"\n}\n ",
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp knox-env | grep user
"knox_user" : "knox"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp mapred-env | grep user
"mapred_user" : "mapred"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp oozie-env | grep user
"oozie_user" : "oozie"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp pig-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp ranger-env | grep user
"admin_username" : "admin",
"ranger_admin_username" : "amb_ranger_admin",
"ranger_user" : "ranger",
"ranger_usersync_log_dir" : "/var/log/hadoop/ranger/usersync"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp slider-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp spark-env | grep user
"spark_user" : "spark"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp sqoop-env | grep user
"sqoop_user" : "sqoop"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp storm-env | grep user
"storm_user" : "storm"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp tez-env | grep user
"tez_user" : "tez"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp webhcat-env | grep user
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp yarn-env | grep user
"min_user_id" : "500",
"yarn_user" : "yarn"
c030579@vmaksa69901bpv $
c030579@vmaksa69901bpv $ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AmbAriTech get localhost tech_hdp zookeeper-env | grep user
"zk_user" : "zookeeper",
c030579@vmaksa69901bpv $
Using these results we can create the specific get/put commands. The template for the set command is:
/var/lib/ambari-server/resources/scripts/configs.sh set localhost tech_hdp <env name> "<key name>" "<new value>"
This table gives the configuration names, keys, and original values.
Service Account   Configuration    Setting Name          Original Value
ams               ams-env          ambari_metrics_user   ams
ambari-qa         cluster-env      smokeuser             ambari-qa
falcon            falcon-env       falcon_user           falcon
flume             flume-env        flume_user            flume
hdfs              hadoop-env       hdfs_user             hdfs
hbase             hbase-env        hbase_user            hbase
hcat              hive-env         hcat_user             hcat
hive              hive-env         hive_user             hive
hcat              hive-env         webhcat_user          hcat
kafka             kafka-env        kafka_user            kafka
knox              knox-env         knox_user             knox
mapred            mapred-env       mapred_user           mapred
oozie             oozie-env        oozie_user            oozie
ranger            ranger-env       ranger_user           ranger
spark             spark-env        spark_user            spark
sqoop             sqoop-env        sqoop_user            sqoop
storm             storm-env        storm_user            storm
tez               tez-env          tez_user              tez
yarn              yarn-env         yarn_user             yarn
zookeeper         zookeeper-env    zk_user               zookeeper
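The rows of the table can be turned into set calls with a small wrapper. This is a sketch only: set_account and target_name are hypothetical helpers introduced here, and the "svc_" prefix illustrates one possible naming convention; substitute your real target account names.

```shell
#!/bin/sh
# Batch-set service accounts from the table above.
CONFIGS=/var/lib/ambari-server/resources/scripts/configs.sh
CLUSTER=tech_hdp

# set_account: wrap the configs.sh set template for one table row.
# Arguments: config group, setting name, new account name.
set_account() {
    "$CONFIGS" -u admin -p AmbAriTech set localhost "$CLUSTER" "$1" "$2" "$3"
}

# target_name: derive a new account name; the "svc_" prefix is illustrative.
target_name() {
    echo "svc_$1"
}

# Example rows from the table (commented out so nothing changes by accident):
# set_account hadoop-env hdfs_user "$(target_name hdfs)"
# set_account hive-env   hive_user "$(target_name hive)"
# set_account yarn-env   yarn_user "$(target_name yarn)"
```

One set_account call is needed per row of the table, matching the one-put-per-account rule above.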
Sample setting
Here is a sample set and restore of a value. This shows how Ambari responds to the REST API calls.
We first query the environment to get the key name and original value.
[root@mon-ronlee-hdp228 ~]# /var/lib/ambari-server/resources/scripts/configs.sh get localhost test ams-env | grep user
"ambari_metrics_user" : "ams",
[root@mon-ronlee-hdp228 ~]# /var/lib/ambari-server/resources/scripts/configs.sh set localhost test ams-env "ambari_metrics_user" "ronlee"
########## Performing 'set' ambari_metrics_user:ronlee on (Site:ams-env, Tag:version1)
########## Config found. Skipping origin value
########## PUTting json into: doSet_version1482208959734754638.json
{
"resources" : [
{
"href" : "http://localhost:8080/api/v1/clusters/test/configurations/service_config_versions?service_name=AMBARI_METRICS&service_config_version=2",
"configurations" : [
{
"clusterName" : "test",
"type" : "ams-env",
"versionTag" : "version1482208959734754638",
"version" : 2,
"serviceConfigVersions" : null,
"configs" : {
"content" : "\n# Set environment variables here.\n\n# The java implementation to use. Java 1.6 required.\nexport JAVA_HOME={{java64_home}}\n\n# Collector Log directory for log4j\nexport AMS_COLLECTOR_LOG_DIR={{ams_collector_log_dir}}\n\n# Monitor Log directory for outfile\nexport AMS_MONITOR_LOG_DIR={{ams_monitor_log_dir}}\n\n# Collector pid directory\nexport AMS_COLLECTOR_PID_DIR={{ams_collector_pid_dir}}\n\n# Monitor pid directory\nexport AMS_MONITOR_PID_DIR={{ams_monitor_pid_dir}}\n\n# AMS HBase pid directory\nexport AMS_HBASE_PID_DIR={{hbase_pid_dir}}\n\n# AMS Collector heapsize\nexport AMS_COLLECTOR_HEAPSIZE={{metrics_collector_heapsize}}\n\n# AMS Collector options\nexport AMS_COLLECTOR_OPTS=\"-Djava.library.path=/usr/lib/ams-hbase/lib/hadoop-native -Xmx$AMS_COLLECTOR_HEAPSIZE \"\n{% if security_enabled %}\nexport AMS_COLLECTOR_OPTS=\"$AMS_COLLECTOR_OPTS -Djava.security.auth.login.config={{ams_collector_jaas_config_file}}\"\n{% endif %}",
"ambari_metrics_user" : "ronlee",
"metrics_monitor_log_dir" : "/var/log/ambari-metrics-monitor",
"metrics_collector_heapsize" : "512m",
"metrics_collector_log_dir" : "/var/log/ambari-metrics-collector",
"metrics_monitor_pid_dir" : "/var/run/ambari-metrics-monitor",
"metrics_collector_pid_dir" : "/var/run/ambari-metrics-collector"
},
"configAttributes" : { }
}
],
"group_id" : null,
"group_name" : null,
"service_config_version" : 2,
"service_config_version_note" : null,
"service_name" : "AMBARI_METRICS"
}
]
}########## NEW Site:ams-env, Tag:version1482208959734754638
We can verify we changed the value by querying the environment again.
[root@mon-ronlee-hdp228 ~]# /var/lib/ambari-server/resources/scripts/configs.sh get localhost test ams-env | grep user
"ambari_metrics_user" : "ronlee",
We can then restore the value.
[root@mon-ronlee-hdp228 ~]# /var/lib/ambari-server/resources/scripts/configs.sh set localhost test ams-env "ambari_metrics_user" "ams"
########## Performing 'set' ambari_metrics_user:ams on (Site:ams-env, Tag:version1482208959734754638)
########## Config found. Skipping origin value
########## PUTting json into: doSet_version1482209050202685766.json
{
"resources" : [
{
"href" : "http://localhost:8080/api/v1/clusters/test/configurations/service_config_versions?service_name=AMBARI_METRICS&service_config_version=3",
"configurations" : [
{
"clusterName" : "test",
"type" : "ams-env",
"versionTag" : "version1482209050202685766",
"version" : 3,
"serviceConfigVersions" : null,
"configs" : {
"content" : "\n# Set environment variables here.\n\n# The java implementation to use. Java 1.6 required.\nexport JAVA_HOME={{java64_home}}\n\n# Collector Log directory for log4j\nexport AMS_COLLECTOR_LOG_DIR={{ams_collector_log_dir}}\n\n# Monitor Log directory for outfile\nexport AMS_MONITOR_LOG_DIR={{ams_monitor_log_dir}}\n\n# Collector pid directory\nexport AMS_COLLECTOR_PID_DIR={{ams_collector_pid_dir}}\n\n# Monitor pid directory\nexport AMS_MONITOR_PID_DIR={{ams_monitor_pid_dir}}\n\n# AMS HBase pid directory\nexport AMS_HBASE_PID_DIR={{hbase_pid_dir}}\n\n# AMS Collector heapsize\nexport AMS_COLLECTOR_HEAPSIZE={{metrics_collector_heapsize}}\n\n# AMS Collector options\nexport AMS_COLLECTOR_OPTS=\"-Djava.library.path=/usr/lib/ams-hbase/lib/hadoop-native -Xmx$AMS_COLLECTOR_HEAPSIZE \"\n{% if security_enabled %}\nexport AMS_COLLECTOR_OPTS=\"$AMS_COLLECTOR_OPTS -Djava.security.auth.login.config={{ams_collector_jaas_config_file}}\"\n{% endif %}",
"ambari_metrics_user" : "ams",
"metrics_monitor_log_dir" : "/var/log/ambari-metrics-monitor",
"metrics_collector_heapsize" : "512m",
"metrics_collector_log_dir" : "/var/log/ambari-metrics-collector",
"metrics_monitor_pid_dir" : "/var/run/ambari-metrics-monitor",
"metrics_collector_pid_dir" : "/var/run/ambari-metrics-collector"
},
"configAttributes" : { }
}
],
"group_id" : null,
"group_name" : null,
"service_config_version" : 3,
"service_config_version_note" : null,
"service_name" : "AMBARI_METRICS"
}
]
}########## NEW Site:ams-env, Tag:version1482209050202685766
[root@mon-ronlee-hdp228 ~]#
References
Apache Ambari web site – Update Service Accounts After Install
https://cwiki.apache.org/confluence/display/AMBARI/Update+Service-Accounts+After+Install
HCC – How to rename service account users in Ambari?
https://community.hortonworks.com/articles/49449/how-to-rename-service-account-users-in-ambari.html
HDP docs – Set Up Service User Accounts (general discussion)
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Installing_HDP_AMB/content/_set_up_service_user_accounts.html
HDP docs – Defining Service Users and Groups for an HDP 2.x Stack (details)
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_ambari_reference_guide/content/_defining_service_users_and_groups_for_a_hdp_2x_stack.html
11-24-2016
04:36 AM
Thank you. Yes it is ancient. I am looking at justifying an upgrade.
11-21-2016
06:10 PM
@R c, does this answer your question? Thanks, Ron
11-21-2016
05:53 PM
I need to run a Views-only server in Ambari 2.0.1. Does this work? It is not mentioned in the docs, but I would expect it to work. Can someone confirm this?
Labels:
- Apache Ambari
11-10-2016
06:25 PM
Hi! You can overcommit RAM in ESXi, so this will work. Each VM will use what it needs up to its configured maximum. However, since HDP tends to run at full performance, you will run out of RAM at some point. From a hardware point of view your server needs more RAM; most ESXi servers with dual 12-core CPUs need 128 GB of RAM to fully utilize the CPU resource.
10-12-2016
05:07 AM
Is there a good example of an Ambari Custom metric I can use as a template to write custom metrics for an app? Anyone interested in writing a tutorial?
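One possible starting point, not an authoritative template: the AMS Collector accepts custom metrics as JSON POSTed to its timeline endpoint. This sketch only builds and prints the payload; the collector hostname, port 6188, metric name, and appid are assumptions — check ams-site in your cluster for the real values.

```shell
#!/bin/sh
# Sketch: build a custom-metric JSON payload for the AMS Collector's
# timeline endpoint (POST http://<metrics-collector-host>:6188/ws/v1/timeline/metrics).
# Metric name "myapp.requests_per_sec" and appid "myapp" are hypothetical.
NOW=$(($(date +%s) * 1000))   # AMS expects millisecond timestamps
PAYLOAD=$(cat <<EOF
{"metrics":[{"metricname":"myapp.requests_per_sec","appid":"myapp","hostname":"$(hostname)","starttime":$NOW,"metrics":{"$NOW":42.0}}]}
EOF
)
echo "$PAYLOAD"
# To actually send it:
# curl -s -H "Content-Type: application/json" -X POST \
#   -d "$PAYLOAD" "http://metrics-collector:6188/ws/v1/timeline/metrics"
```

Once posted, the metric should be queryable from the same collector by metricname and appid.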
Labels:
- Apache Ambari
07-26-2016
03:08 AM
I will be configuring an HDP cluster with a standalone KDC. I see info on how to set up multiple KDCs for the HDP components. How should I set up the multiple KDCs? Can I create a master KDC with an HA pair? Has anyone deployed something like this in the real world?
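For context, MIT Kerberos lets clients fail over between KDCs simply by listing several kdc entries per realm in krb5.conf; writes (kadmin, password changes) still go to the admin_server on the master. A sketch of the client-side fragment, with hypothetical hostnames:

```
[realms]
  EXAMPLE.COM = {
    kdc = kdc1.example.com          # master KDC
    kdc = kdc2.example.com          # replica, kept in sync via kprop/kpropd
    admin_server = kdc1.example.com # kadmind runs only on the master
  }
```

The replica serves tickets read-only; the database is pushed to it from the master, so losing the master degrades only administrative operations, not authentication.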
Labels:
- Hortonworks Data Platform (HDP)
06-20-2016
12:50 AM
The KDC should be on a separate machine because you will eventually have to turn it over to computer security since it is a source of authority for the principals. They should not let the HDP admins authorize their own accounts.