HDFS per-user metrics aren't emitted by default. Exercise caution before enabling them, and make sure the client and service RPC port numbers described below match your cluster.

To use the HDFS - Users dashboard in your Grafana instance and to view per-user HDFS metrics, you will need to add the custom properties below to your configuration.

Step-by-step guide

Assumption for this guide:
This is an HA environment with dfs.internal.nameservices=nnha and dfs.ha.namenodes.nnha=nn1,nn2 set in Ambari under HDFS > Configs > Advanced > Custom hdfs-site.
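
If you are not sure of these values on your cluster, they can be confirmed from any NameNode or HDFS client host with hdfs getconf; the property names are standard, while nnha is the example nameservice assumed here:

hdfs getconf -confKey dfs.internal.nameservices   # e.g. nnha
hdfs getconf -confKey dfs.ha.namenodes.nnha       # e.g. nn1,nn2
hdfs getconf -namenodes                           # FQDNs of the NameNode hosts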

1. In Ambari, go to HDFS > Configs > Advanced > Custom hdfs-site and add the following properties.

dfs.namenode.servicerpc-address.<dfs.internal.nameservices>.nn1=<namenodehost1>:8050
dfs.namenode.servicerpc-address.<dfs.internal.nameservices>.nn2=<namenodehost2>:8050
ipc.8020.callqueue.impl=org.apache.hadoop.ipc.FairCallQueue
ipc.8020.backoff.enable=true
ipc.8020.scheduler.impl=org.apache.hadoop.ipc.DecayRpcScheduler
ipc.8020.scheduler.priority.levels=3
ipc.8020.decay-scheduler.backoff.responsetime.enable=true
ipc.8020.decay-scheduler.backoff.responsetime.thresholds=10,20,30

If you have already enabled the Service RPC port, you can skip the first two servicerpc-address lines (a quick way to check is shown after these notes).

Replace 8020 with your NameNode client RPC port if it is different.

DO NOT replace it with the Service RPC port or the DataNode Lifeline port.
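
If in doubt about the ports, the current client RPC address and (if already configured) Service RPC address can be printed with hdfs getconf; nnha and nn1 are the example values assumed in this guide:

hdfs getconf -confKey dfs.namenode.rpc-address.nnha.nn1          # client RPC address, e.g. <namenodehost1>:8020
hdfs getconf -confKey dfs.namenode.servicerpc-address.nnha.nn1   # reports the key as missing if the Service RPC port is not yet set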

2. After this change, you may see both NameNodes reported as Active, or both as Standby, in Ambari.
To avoid this issue (a combined sketch follows step c):

a. Stop the ZKFC on both NameNodes
b. Run the following commands from one of the NameNode hosts as the hdfs user

su - hdfs
hdfs zkfc -formatZK

c. Restart all ZKFC
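
Taken together, a minimal sketch of steps a-c looks like the following, assuming the nn1/nn2 IDs from above; ZKFC is stopped and restarted from Ambari, and hdfs haadmin is only used to verify that exactly one NameNode ends up Active:

# On one NameNode host, after stopping ZKFC on both NameNodes in Ambari:
su - hdfs
hdfs zkfc -formatZK                  # re-initializes the HA state in ZooKeeper; answer Y if prompted
# Restart ZKFC on both NameNodes in Ambari, then verify the HA state:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2    # one should report active, the other standby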


3. Restart HDFS, and you should see the metrics being emitted.
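
One way to confirm this is to look for DecayRpcScheduler entries in the NameNode JMX output (the exact bean names can vary by Hadoop version); 50070 is the default NameNode HTTP port on HDP 2.6, so adjust it if yours differs:

curl -s http://<namenodehost1>:50070/jmx | grep -i decayrpcscheduler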

4. After a few minutes, you should also be able to use the HDFS - Users dashboard in Grafana.

Things to ensure:

  1. Client RPC port: 8020 (if different, replace it with the appropriate port in all of the ipc.8020.* keys)
  2. Service RPC port: 8050 (if different, replace it with the appropriate port in the first two servicerpc-address values)
  3. namenodehost1 and namenodehost2: need to be replaced with the actual hosts from the cluster and must be FQDNs
  4. dfs.internal.nameservices: needs to be replaced with the actual value from the cluster

Example:
dfs.namenode.servicerpc-address.nnha.nn1=<namenodehost1>:8050
dfs.namenode.servicerpc-address.nnha.nn2=<namenodehost2>:8050

* If you have more than two NameNodes in your HA environment, add one additional line for each extra NameNode:
dfs.namenode.servicerpc-address.<dfs.internal.nameservices>.nnX=<namenodehostX>:8050

Adapted from this wiki, which describes how to enable per-user HDFS metrics in a non-HA environment.

Note: This article has been validated against Ambari 2.5.2 and HDP 2.6.2.

It will not work in older versions of Ambari due to this bug: https://issues.apache.org/jira/browse/AMBARI-21640
