Created on 07-07-201604:13 AM - edited on 10-19-202110:47 PM by subratadas
With HDFS HA, the NameNode is no longer a single point of failure in a Hadoop cluster. However the performance of a single NameNode can often limit the performance of jobs across the cluster. The Apache Hadoop community has made a number of NameNode scalability improvements. This series of articles (also see part 2, part 3, part 4 and part 5) explains how you can make configuration changes to help your NameNode scale better and also improve its diagnosability.
This article is for Hadoop administrators who are familiar with HDFS and its components. If you are using Ambari or Cloudera Manager you should know how to manage services and configurations.
HDFS Audit Logs
The HDFS audit log is a separate log file that contains one entry for each user request. The audit log is a key resource for offline analysis of application traffic to find out users/jobs that generate excessive load on the NameNode. Careful analysis of the HDFS audit logs can also identify application bugs/misconfiguration.
Also, check the NameNode startup options in hadoop-env.sh to ensure there is no conflicting override for the hdfs.audit.logger setting. The following is an example of a correct HADOOP_NAMENODE_OPTS setting.
Note: If you enable HDFS audit logs, you should also enable Async Audit Logging as described in the next section.
Async Audit Logging
Using an async log4j appender for the audit logs significantly improves the performance of the NameNode. This setting is critical for the NameNode to scale beyond 10,000 requests/second. Add the following to your hdfs-site.xml.
If you are managing your cluster with Ambari, this setting is already enabled by default. If you're using CDH, you'll want to add the property to "NameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml". For a CDP cluster, this property is exposed in the UI and enabled by default.
Service RPC port
The service RPC port gives the DataNodes a dedicated port to report their status via block reports and heartbeats. The port is also used by Zookeeper Failover Controllers for periodic health checks by the automatic failover logic. The port is never used by client applications hence it reduces RPC queue contention between client requests and DataNode messages.
For a non-HA cluster, the service RPC port can be enabled with the following setting in hdfs-site.xml (replacing mynamenode.example.com with the hostname or IP address of your NameNode). The port number can be something other than 8040.
If you have enabled Automatic Failover using ZooKeeper Failover Controllers, an additional step is required to reset the ZKFC state in ZooKeeper. Stop both the ZKFC processes and run the following command as the hdfs superuser.
hdfs zkfc –formatZK
Note: Changing the service RPC port settings requires a restart of the NameNodes, DataNodes and ZooKeeper Failover Controllers to take full effect. If you have NameNode HA setup, you can restart the NameNodes one at a time followed by a rolling restart of the remaining components to avoid cluster downtime.