How to configure Prometheus JMX exporter using Ambari

We are new here and are using Cloudbreak and Hadoop as our data platform. We need to register a Prometheus JMX exporter by adding the following lines to hadoop-env.sh:

export HDFS_NAMENODE_OPTS="${HDFS_NAMENODE_OPTS} -javaagent:/home/hduser/jmx_prometheus_javaagent-0.11.0.jar=19850:/home/hduser_/namenode.yml"

export HDFS_DATANODE_OPTS="${HDFS_DATANODE_OPTS} -javaagent:/home/hduser/jmx_prometheus_javaagent-0.11.0.jar=19851:/home/hduser_/datanode.yml"

When we ssh into nodes and change "/etc/hadoop/2.6.5.0-292/0/hadoop-env.sh", the changes are reverted on restart.

So we changed Advanced hadoop-env from the Ambari management UI and added the above lines there. After a restart, netstat shows no process listening on 19850 or 19851.

How can we configure the Prometheus JMX exporter?




Master Mentor

@Rahul Borkar

In an Ambari-managed cluster, any manual changes made inside scripts like "hadoop-env.sh" are reverted as soon as those components are restarted from the Ambari UI, because Ambari pushes the configs stored in the Ambari DB for those script templates to that host.


Hence, to make such changes you must use the Ambari UI / APIs, for example "Advanced hadoop-env" in Ambari.


After making those changes in the Ambari UI, do you actually see the mentioned Java arguments in the output of the following commands?

# ps -ef | grep -i NameNode
# ps -ef | grep -i DataNode


A. If you do not see the "javaagent" options in the output of the above commands, the changes were not applied properly. In that case, please share the full "Advanced hadoop-env" template from the Ambari UI so that we can check whether it was applied properly.


B. If you do see the "javaagent" options in the process list output, then there may be something wrong with either the file permissions of "/home/hduser/jmx_prometheus_javaagent-0.11.0.jar" and "/home/hduser_/datanode.yml", or the YAML file content. The DataNode and NameNode processes normally run as the "hdfs" user, so please check the file permissions and ownership.


Can you please share the /home/hduser_/datanode.yml file content as well ?
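
Checks A and B can be scripted. The following is a minimal sketch, not part of the original reply: the helper names `javaagent_args` and `javaagent_port` are hypothetical, and the port parsing assumes the `port:config` form used in this thread (a spec with a leading host is not handled).

```shell
# Hypothetical helpers; names are illustrative only.

# Print every -javaagent option found in a command-line string, one per line.
# Returns non-zero when none is present.
javaagent_args() {
  printf '%s\n' "$1" | tr ' ' '\n' | grep -- '^-javaagent:'
}

# Extract the port from "-javaagent:<jar>=<port>:<config>".
javaagent_port() {
  local spec="${1#*=}"          # drop everything up to the first "="
  printf '%s\n' "${spec%%:*}"   # keep what precedes the first ":"
}
```

One way to use it: pass the output of `ps -ef | grep -i '[N]ameNode'` to `javaagent_args`, then look for the reported port in the `netstat` output. Whether the "hdfs" user can read a file can be tested with something like `sudo -u hdfs test -r <path>`, assuming sudo is available on the host.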


I am not sure how to add my reply here, so I posted it as an Answer. Jay, please let me know if you are not able to see it.


The above question and the replies below were originally posted in the Community Help Track. On Wed Jun 5 13:57 UTC 2019, a member of the HCC moderation staff moved it to the Cloud & Operations track. The Community Help Track is intended for questions about using the HCC site itself.

Bill Brooks, Community Moderator


Thanks Jay for the prompt answer,


I made changes to Advanced hadoop-env and here is my full template


      # Set Hadoop-specific environment variables here.

      # The only required environment variable is JAVA_HOME.  All others are
      # optional.  When running a distributed configuration it is best to
      # set JAVA_HOME in this file, so that it is correctly defined on
      # remote nodes.

      # The java implementation to use.  Required.
      export JAVA_HOME={{java_home}}
      export HADOOP_HOME_WARN_SUPPRESS=1

      # Hadoop home directory
      export HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}

      # Hadoop Configuration Directory

      {# this is different for HDP1 #}
      # Path to jsvc required by secure HDP 2.0 datanode
      export JSVC_HOME={{jsvc_path}}


      # The maximum amount of heap to use, in MB. Default is 1000.
      export HADOOP_HEAPSIZE="{{hadoop_heapsize}}"

      export HADOOP_NAMENODE_INIT_HEAPSIZE="-Xms{{namenode_heapsize}}"

      # Extra Java runtime options.  Empty by default.
      export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true ${HADOOP_OPTS}"

      USER="$(whoami)"

      # Command specific options appended to HADOOP_OPTS when specified
      HADOOP_JOBTRACKER_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{jtnode_opt_newsize}} -XX:MaxNewSize={{jtnode_opt_maxnewsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xmx{{jtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dmapred.audit.logger=INFO,MRAUDIT -Dhadoop.mapreduce.jobsummary.logger=INFO,JSA ${HADOOP_JOBTRACKER_OPTS}"

      HADOOP_TASKTRACKER_OPTS="-server -Xmx{{ttnode_heapsize}} -Dhadoop.security.logger=ERROR,console -Dmapred.audit.logger=ERROR,console ${HADOOP_TASKTRACKER_OPTS}"

      {% if java_version < 8 %}
      SHARED_HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{namenode_opt_newsize}} -XX:MaxNewSize={{namenode_opt_maxnewsize}} -XX:PermSize={{namenode_opt_permsize}} -XX:MaxPermSize={{namenode_opt_maxpermsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms{{namenode_heapsize}} -Xmx{{namenode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT"
      export HADOOP_NAMENODE_OPTS="${SHARED_HADOOP_NAMENODE_OPTS} -XX:OnOutOfMemoryError=\"/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node\" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 ${HADOOP_NAMENODE_OPTS}"
      export HADOOP_DATANODE_OPTS="-server -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -XX:PermSize=128m -XX:MaxPermSize=256m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{dtnode_heapsize}} -Xmx{{dtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_DATANODE_OPTS} -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly"

      export HADOOP_SECONDARYNAMENODE_OPTS="${SHARED_HADOOP_NAMENODE_OPTS} -XX:OnOutOfMemoryError=\"/usr/hdp/current/hadoop-hdfs-secondarynamenode/bin/kill-secondary-name-node\" ${HADOOP_SECONDARYNAMENODE_OPTS}"

      # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
      export HADOOP_CLIENT_OPTS="-Xmx${HADOOP_HEAPSIZE}m -XX:MaxPermSize=512m $HADOOP_CLIENT_OPTS"

      {% else %}
      SHARED_HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{namenode_opt_newsize}} -XX:MaxNewSize={{namenode_opt_maxnewsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms{{namenode_heapsize}} -Xmx{{namenode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT"
      export HADOOP_NAMENODE_OPTS="${SHARED_HADOOP_NAMENODE_OPTS} -XX:OnOutOfMemoryError=\"/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node\" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 ${HADOOP_NAMENODE_OPTS}"
      export HADOOP_DATANODE_OPTS="-server -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{dtnode_heapsize}} -Xmx{{dtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_DATANODE_OPTS} -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly"

      export HADOOP_SECONDARYNAMENODE_OPTS="${SHARED_HADOOP_NAMENODE_OPTS} -XX:OnOutOfMemoryError=\"/usr/hdp/current/hadoop-hdfs-secondarynamenode/bin/kill-secondary-name-node\" ${HADOOP_SECONDARYNAMENODE_OPTS}"

      # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
      export HADOOP_CLIENT_OPTS="-Xmx${HADOOP_HEAPSIZE}m $HADOOP_CLIENT_OPTS"
      {% endif %}

      HADOOP_NFS3_OPTS="-Xmx{{nfsgateway_heapsize}}m -Dhadoop.security.logger=ERROR,DRFAS ${HADOOP_NFS3_OPTS}"
      HADOOP_BALANCER_OPTS="-server -Xmx{{hadoop_heapsize}}m ${HADOOP_BALANCER_OPTS}"


      # On secure datanodes, user to run the datanode as after dropping privileges
      export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER:-{{hadoop_secure_dn_user}}}

      # Extra ssh options.  Empty by default.
      export HADOOP_SSH_OPTS="-o ConnectTimeout=5 -o SendEnv=HADOOP_CONF_DIR"

      # Where log files are stored.  $HADOOP_HOME/logs by default.
      export HADOOP_LOG_DIR={{hdfs_log_dir_prefix}}/$USER

      # History server logs
      export HADOOP_MAPRED_LOG_DIR={{mapred_log_dir_prefix}}/$USER

      # Where log files are stored in the secure data environment.
      export HADOOP_SECURE_DN_LOG_DIR={{hdfs_log_dir_prefix}}/$HADOOP_SECURE_DN_USER

      # File naming remote slave hosts.  $HADOOP_HOME/conf/slaves by default.
      # export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves

      # host:path where hadoop code should be rsync'd from.  Unset by default.
      # export HADOOP_MASTER=master:/home/$USER/src/hadoop

      # Seconds to sleep between slave commands.  Unset by default.  This
      # can be useful in large clusters, where, e.g., slave rsyncs can
      # otherwise arrive faster than the master can service them.
      # export HADOOP_SLAVE_SLEEP=0.1

      # The directory where pid files are stored. /tmp by default.
      export HADOOP_PID_DIR={{hadoop_pid_dir_prefix}}/$USER
      export HADOOP_SECURE_DN_PID_DIR={{hadoop_pid_dir_prefix}}/$HADOOP_SECURE_DN_USER

      # History server pid
      export HADOOP_MAPRED_PID_DIR={{mapred_pid_dir_prefix}}/$USER

      YARN_RESOURCEMANAGER_OPTS="-Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY"

      # A string representing this instance of hadoop. $USER by default.
      export HADOOP_IDENT_STRING=$USER

      # The scheduling priority for daemon processes.  See 'man nice'.

      # export HADOOP_NICENESS=10

      # Add database libraries
      JAVA_JDBC_LIBS=""
      if [ -d "/usr/share/java" ]; then
      for jarFile in `ls /usr/share/java | grep -E "(mysql|ojdbc|postgresql|sqljdbc)" 2>/dev/null`
      do
      JAVA_JDBC_LIBS=${JAVA_JDBC_LIBS}:$jarFile
      done
      fi

      # Add libraries to the hadoop classpath - some may not need a colon as they already include it
      export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS}:/usr/lib/hadoop/lib/*

      # Setting path to hdfs command line
      export HADOOP_LIBEXEC_DIR={{hadoop_libexec_dir}}

      # Mostly required for hadoop 2.0
      export JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:{{hadoop_lib_home}}/native/Linux-{{architecture}}-64

      export HADOOP_OPTS="-Dhdp.version=$HDP_VERSION $HADOOP_OPTS"


      # Fix temporary bug, when ulimit from conf files is not picked up, without full relogin.
      # Makes sense to fix only when runing DN as root
      if [ "$command" == "datanode" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_SECURE_DN_USER" ]; then
      {% if is_datanode_max_locked_memory_set %}
      ulimit -l {{datanode_max_locked_memory}}
      {% endif %}
      ulimit -n {{hdfs_user_nofile_limit}}
      fi

      {% if hadoop_custom_extensions_enabled %}
      #Enable custom extensions
      export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:{{stack_root}}/current/ext/hadoop/*:/usr/lib/hadoop/lib/*
      {% endif %}

      # Enable ACLs on zookeper znodes if required
      {% if hadoop_zkfc_opts is defined %}
      export HADOOP_ZKFC_OPTS="{{hadoop_zkfc_opts}} $HADOOP_ZKFC_OPTS"
      {% endif %}


#This is Rahul
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS} -javaagent:/home/sshhdfsuser/jmx_prometheus_javaagent-0.11.0.jar=19850:/home/sshhdfsuser/namenode.yml


These changes are propagated to /etc/hadoop/2.6.5.0-292/0/hadoop-env.sh

After I run " ps -ef | grep -i NameNode"

Following is the output,

hdfs      11560      1  2 14:21 ?        00:00:32 /usr/lib/jvm/java/bin/java -Dproc_namenode -Xmx1024m -Dhdp.version=2.6.5.0-292 -Djava.net.preferIPv4Stack=true -Dhdp.version= -Djava.net.preferIPv4Stack=true -Dhdp.version= -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop/hdfs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.6.5.0-292/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhdp.version=2.6.5.0-292 -Dhadoop.log.dir=/var/log/hadoop/hdfs -Dhadoop.log.file=hadoop-hdfs-namenode-dev-hdfs-restore-clstr-m0.log -Dhadoop.home.dir=/usr/hdp/2.6.5.0-292/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=128m -XX:MaxNewSize=128m -Xloggc:/var/log/hadoop/hdfs/gc.log-201906051421 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms1024m -Xmx1024m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC 
-XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=128m -XX:MaxNewSize=128m -Xloggc:/var/log/hadoop/hdfs/gc.log-201906051421 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms1024m -Xmx1024m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=128m -XX:MaxNewSize=128m -Xloggc:/var/log/hadoop/hdfs/gc.log-201906051421 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms1024m -Xmx1024m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.NameNode
hdfs      13195      1  0 14:22 ?        00:00:07 /usr/lib/jvm/java/bin/java -Dproc_secondarynamenode -Xmx1024m -Dhdp.version=2.6.5.0-292 -Djava.net.preferIPv4Stack=true -Dhdp.version= -Djava.net.preferIPv4Stack=true -Dhdp.version= -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop/hdfs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.6.5.0-292/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhdp.version=2.6.5.0-292 -Dhadoop.log.dir=/var/log/hadoop/hdfs -Dhadoop.log.file=hadoop-hdfs-secondarynamenode-dev-hdfs-restore-clstr-m0.log -Dhadoop.home.dir=/usr/hdp/2.6.5.0-292/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.6.5.0-292/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=128m -XX:MaxNewSize=128m -Xloggc:/var/log/hadoop/hdfs/gc.log-201906051422 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms1024m -Xmx1024m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-secondarynamenode/bin/kill-secondary-name-node" -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log 
-XX:NewSize=128m -XX:MaxNewSize=128m -Xloggc:/var/log/hadoop/hdfs/gc.log-201906051422 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms1024m -Xmx1024m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-secondarynamenode/bin/kill-secondary-name-node" -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=128m -XX:MaxNewSize=128m -Xloggc:/var/log/hadoop/hdfs/gc.log-201906051422 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms1024m -Xmx1024m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-secondarynamenode/bin/kill-secondary-name-node" -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
root      21043 120868  0 14:46 pts/0    00:00:00 grep --color=auto -i NameNode
[root@dev-hdfs-restore-clstr-m0 sshhdfsuser]#

There is nothing like "jmx_prometheus_javaagent" in the output above.


I have changed the permission of my "/home/sshhdfsuser/jmx_prometheus_javaagent-0.11.0.jar" and "namenode.yml" to "777"

my namenode.yml is,

---
startDelaySeconds: 0
ssl: false
lowercaseOutputName: true
lowercaseOutputLabelNames: true
whitelistObjectNames:
  - 'Hadoop:service=NameNode,name=*'
  - 'Hadoop:service=NameNode,name=MetricsSystem,sub=*'
blacklistObjectNames:
  - 'Hadoop:service=NameNode,name=RetryCache.NameNodeRetryCache'
  - 'Hadoop:service=NameNode,name=RpcActivity*'
  - 'Hadoop:service=NameNode,name=RpcDetailedActivity*'
  - 'Hadoop:service=NameNode,name=UgiMetrics'
rules:
  # MetricsSystem
  - pattern: 'Hadoop<service=(.*), name=MetricsSystem, sub=(.*)><>(.*): (\d+)'
    attrNameSnakeCase: true
    name: hadoop_$1_$3
    value: $4
    labels:
      service: HDFS
      role: $1
      kind: 'MetricsSystem'
      sub: $2
    type: GAUGE
  # All NameNode infos
  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*): (\d+)'
    attrNameSnakeCase: true
    name: hadoop_$1_$3
    value: $4
    labels:
      service: HDFS
      role: $1
      kind: $2
    type: GAUGE


I am not sure what I am doing wrong here???
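
For readers unfamiliar with the `rules` section, here is a rough illustration of what the second rule does with its capture groups. It mimics the pattern and the `hadoop_$1_$3` name template with `sed`, then lowercases the result as `lowercaseOutputName` would. The input line is a hypothetical example of a bean attribute after snake-casing, and the anchored `sed` expression is a simplification, not the exporter's actual matching logic.

```shell
# Illustrative only: apply the rule's regex and name template to one sample line.
render_metric() {
  printf '%s\n' "$1" \
    | sed -E 's/^Hadoop<service=(.*), name=(.*)><>(.*): ([0-9]+)$/hadoop_\1_\3 \4/' \
    | tr '[:upper:]' '[:lower:]'
}

# Hypothetical input; prints: hadoop_namenode_capacity_total 1024
render_metric 'Hadoop<service=NameNode, name=FSNamesystem><>capacity_total: 1024'
```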

Master Mentor

@Rahul Borkar

Ambari uses Jinja templates to generate these files. Jinja has specific rules for variable substitution and is strict about missing quotes.

Your template has a few issues. Newline characters have been inserted at many places where {{VARIABLE}} occurs in your template, so it ends up as

{
{VARIABLE}} 


Example: (in your case)

      export JAVA_HOME={
  {java_home}}

Ideally it should be

      export JAVA_HOME={{java_home}}


Similarly, your hadoop-env template has many newlines added in between the {{ brackets.

The other main issue is that your last line is missing the closing quotation mark:

#This is Rahul
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS} -javaagent:/home/sshhdfsuser/jmx_prometheus_javaagent-0.11.0.jar=19850:/home/sshhdfsuser/namenode.yml


Ideally it should be

#This is Rahul
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS} -javaagent:/home/sshhdfsuser/jmx_prometheus_javaagent-0.11.0.jar=19850:/home/sshhdfsuser/namenode.yml"
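
An unterminated quote like this can be caught before restarting anything. The sketch below is an assumption about how one might check it, not something from the original reply: `bash -n` parses a file without executing it, and the function name `check_env_syntax` is hypothetical. It should be run against the rendered file on the host (for example /etc/hadoop/2.6.5.0-292/0/hadoop-env.sh), since the Jinja tags in the Ambari template are not valid shell.

```shell
# Return 0 when the file parses as bash, non-zero on a syntax error.
# bash -n reads the commands without executing them.
check_env_syntax() {
  bash -n "$1"
}
```

With the missing closing quote, bash reports an unexpected end of file and the function returns a non-zero status.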




Master Mentor

@Rahul Borkar

Maybe you can try the following kind of template (please see the attached file):

EDITED_hadoop-env_Template.txt


@Jay Kumar SenSharma

After updating Advanced hadoop-env with the text file you posted, my cluster refuses to start (screenshot: 109236-1559808666025.png). Please find my log file attached.


I am concerned about the following block,

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:386)
    at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:401)
Caused by: java.lang.IllegalArgumentException: Collector already registered that provides name: jmx_exporter_build_info
    at io.prometheus.jmx.shaded.io.prometheus.client.CollectorRegistry.register(CollectorRegistry.java:54)
    at io.prometheus.jmx.shaded.io.prometheus.client.Collector.register(Collector.java:139)
    at io.prometheus.jmx.shaded.io.prometheus.client.Collector.register(Collector.java:132)
    at io.prometheus.jmx.shaded.io.prometheus.jmx.JavaAgent.premain(JavaAgent.java:51)
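
A likely reading of this trace, offered as an assumption rather than a confirmed diagnosis: the agent's `premain` ran twice in one JVM, which happens when the `-javaagent` option appears more than once on the command line. The `ps` output earlier in the thread shows the NameNode options repeated several times, which suggests hadoop-env.sh is sourced more than once per start. The sketch below simulates that with `eval`; the jar and YAML paths are placeholders and `count_javaagents` is a hypothetical helper.

```shell
# Count -javaagent options in a command-line string.
count_javaagents() {
  # grep -c exits non-zero on zero matches, so "|| true" keeps the status clean
  printf '%s\n' "$1" | tr ' ' '\n' | grep -c -- '^-javaagent:' || true
}

# The unguarded line, as it would appear in hadoop-env.sh (placeholder paths).
snippet='export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS} -javaagent:/path/to/agent.jar=19850:/path/to/namenode.yml"'

HADOOP_NAMENODE_OPTS=""
eval "$snippet"   # first time the file is sourced
eval "$snippet"   # second time: the option is appended again

count_javaagents "$HADOOP_NAMENODE_OPTS"   # prints 2
```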


After a lot of research I got the NameNode exporter working.

I added the following lines at the end of the file and got a response from curl localhost:19850:


if [ "$command" == "namenode" ]; then
  export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS} -javaagent:/home/sshhdfsuser/jmx_prometheus_javaagent-0.11.0.jar=19850:/home/sshhdfsuser/namenode.yml"
fi

Everything starts and runs smoothly.
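
To confirm the exporter is really serving metrics and not just holding the port open, the response can be checked for a known metric. This is a small sketch under assumptions: `has_metric` is a hypothetical helper, and `jmx_exporter_build_info` is chosen only because that name appears in the stack trace above.

```shell
# Succeed if the Prometheus text payload on stdin contains a sample
# for the metric named in $1 (followed by "{" labels or a space).
has_metric() {
  grep -q -E "^$1(\{| )"
}
```

For example: `curl -s localhost:19850/metrics | has_metric jmx_exporter_build_info && echo "exporter is up"`.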


Now when I try to do the same for Datanode, like adding the following

if [ "$command" == "datanode" ]; then
  export HADOOP_DATANODE_OPTS="${HADOOP_DATANODE_OPTS} -javaagent:/home/sshhdfsuser/jmx_prometheus_javaagent-0.11.0.jar=19851:/home/sshhdfsuser/datanode.yml"
fi


Nothing starts, and I get the error messages shown in the attached log file.

datanode_errors.txt

I think the following line is the problem, and I am not sure why the behaviour for the DataNode is different from the NameNode.

Error opening zip file or JAR manifest missing : /home/sshhdfsuser/jmx_prometheus_javaagent-0.11.0.jar
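
This JVM message generally means the agent jar could not be opened at that path. Two possibilities worth ruling out (both are guesses, the thread does not confirm either): the jar was never copied to the DataNode hosts, or the "hdfs" user cannot traverse /home/sshhdfsuser, since a 777 mode on the file does not help if a parent directory denies search permission. The sketch below walks an absolute path and prints the first component the current user cannot enter or read; the function name is hypothetical.

```shell
# Print the first component of an absolute path that the current user
# cannot traverse (directory) or read (file). Returns 0 if all is fine.
first_blocked_component() {
  local rest="${1#/}" current=""
  while [ -n "$rest" ]; do
    current="$current/${rest%%/*}"
    case "$rest" in
      */*) rest="${rest#*/}" ;;   # more components follow
      *)   rest="" ;;             # this was the last one
    esac
    if [ -d "$current" ]; then
      [ -x "$current" ] || { printf '%s\n' "$current"; return 1; }
    elif [ ! -r "$current" ]; then
      # missing or unreadable file
      printf '%s\n' "$current"; return 1
    fi
  done
  return 0
}
```

It is only meaningful when run as the user that starts the DataNode, on each DataNode host, against the jar path and the YAML path.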


New Contributor

@Rahul Borkar

Use the Ambari UI / APIs, for example "Advanced hadoop-env" in Ambari, and add the following lines at the end of the file:


# Add java-agent to get jmx metrics for prometheus

agent_namenode=`echo $HADOOP_NAMENODE_OPTS | grep javaagent | wc -l `

if [ "$agent_namenode" == 0 ]; then

export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=10010 -javaagent:/usr/hdp/2.3.4.0-3485/hadoop/lib/jmx_prometheus_javaagent-0.11.0.jar=9998:/usr/hdp/2.3.4.0-3485/hadoop/jmx_exporter/namenode.yaml $HADOOP_NAMENODE_OPTS"

fi


agent_datanode=`echo $HADOOP_DATANODE_OPTS | grep javaagent | wc -l `

if [ "$agent_datanode" == 0 ]; then

export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=10011 -javaagent:/usr/hdp/2.3.4.0-3485/hadoop/lib/jmx_prometheus_javaagent-0.11.0.jar=9999:/usr/hdp/2.3.4.0-3485/hadoop/jmx_exporter/datanode.yaml $HADOOP_DATANODE_OPTS"

fi

Change the permissions of "jmx_prometheus_javaagent-0.11.0.jar", "namenode.yaml" and "datanode.yaml" to "777" and put them in the right place.


Restart the NameNode and DataNode at the same time, and you should get what you want.
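
The guard in this reply can also be written without spawning `grep` and `wc`. The following is a sketch of the same idea using a `case` pattern match; `with_agent_once` is a hypothetical name and the paths in the usage line are placeholders.

```shell
# Print $1 with the agent option $2 appended, unless $1 already
# contains a -javaagent option. Safe to apply on every sourcing.
with_agent_once() {
  case "$1" in
    *-javaagent:*) printf '%s\n' "$1" ;;           # already present: unchanged
    *)             printf '%s\n' "${1:+$1 }$2" ;;  # append, with a space if needed
  esac
}
```

Usage in hadoop-env would look like `export HADOOP_DATANODE_OPTS="$(with_agent_once "$HADOOP_DATANODE_OPTS" "-javaagent:/path/to/agent.jar=9999:/path/to/datanode.yaml")"`. Because the result is the same no matter how many times the file is sourced, the option cannot end up on the command line twice.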