Use recommended mount options for all HDFS data disks

There are specific filesystem mount options that have proven to be more efficient for Hadoop clusters, and using them provides performance benefits. Since mount options are applied when the filesystem is mounted (at system boot or on remount, for example), changing /etc/fstab alone is not enough for these settings to take effect. The recommended approach is to make the mount option changes in /etc/fstab and then either manually remount the individual filesystems or reboot the host.

Use the following mount options for the respective file systems:

ext4 —> inode_readahead_blks=128,data=writeback,noatime,nodev
xfs —> noatime
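
For illustration, /etc/fstab entries for two data disks might look like the following sketch (the device names and mount points are hypothetical; adjust them to your own layout):

# hypothetical HDFS data disks in /etc/fstab
/dev/sdb1  /grid/0  ext4  inode_readahead_blks=128,data=writeback,noatime,nodev  0 0
/dev/sdc1  /grid/1  xfs   noatime  0 0

After editing /etc/fstab, remount each filesystem (for example, mount -o remount /grid/0) or reboot the host so the options take effect.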

Configure HDFS block size for optimal performance

An optimal HDFS block size boosts NameNode performance as well as job execution performance.

Make sure that the block size ('dfs.blocksize' in 'hdfs-site.xml') is within the recommended range of 134217728 bytes (128 MiB) to 1073741824 bytes (1 GiB), exclusive.
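
For example, a 256 MiB block size, which falls within that range, could be set as follows (the value is illustrative; choose one that suits your workload):

dfs.blocksize=268435456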

Enable HDFS short circuit reads

In HDFS, reads normally go through the DataNode. Thus, when the client asks the DataNode to read a file, the DataNode reads that file off of the disk and sends the data to the client over a TCP socket. So-called short-circuit reads bypass the DataNode, allowing the client to read the file directly. Obviously, this is only possible in cases where the client is co-located with the data. Short-circuit reads provide a substantial performance boost to many applications.

Enable short-circuit reads for better performance. Short-circuit local reads rely on a UNIX domain socket, so you will also need the native library libhadoop.so and a socket path (dfs.domain.socket.path).

In hdfs-site.xml set the below:

dfs.client.read.shortcircuit=true
dfs.domain.socket.path=/var/lib/hadoop-hdfs/dn_socket
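
The DataNode must be able to create the socket at this path, and the path should not be writable by untrusted users. A minimal sketch, assuming the DataNode runs as the hdfs user in the hadoop group (adjust for your environment):

# create the socket directory and restrict ownership (hypothetical user/group)
mkdir -p /var/lib/hadoop-hdfs
chown hdfs:hadoop /var/lib/hadoop-hdfs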

Avoid file sizes that are smaller than a block size

The average block size should be greater than the recommended 67108864 bytes (64 MiB). An average size below that adds more burden to the NameNode and causes heap/GC issues, in addition to making storage and processing inefficient.

Keep the average block size above 67108864 bytes (64 MiB).

Also, use one or more of the following techniques to consolidate smaller files:
- run Hive/HBase compactions
- merge small files
- use HAR (Hadoop Archives) to compact small files, as shown below
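
For example, to pack a directory of small files into a Hadoop Archive (the paths here are hypothetical):

hadoop archive -archiveName small-files.har -p /user/hadoop/input /user/hadoop/archives

The archived files can then be read back through the har:// filesystem scheme.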

Tune DataNode JVM for optimal performance

DataNode is sensitive to JVM performance and behavior. Make sure that the DataNode JVM is configured for optimal performance.

Sample JVM Configs:

-Djava.net.preferIPv4Stack=true
-XX:ParallelGCThreads=8
-XX:+UseConcMarkSweepGC
-Xloggc:*
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps

Also:

-Xms should be the same as -Xmx.
The new generation size should be ⅛ of the total JVM heap size.
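
As an illustrative example, for a hypothetical 4 GB DataNode heap these two rules translate to:

-Xms4g -Xmx4g -XX:NewSize=512m -XX:MaxNewSize=512m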

Avoid reads or write from stale DataNodes

DataNodes that have not sent a heartbeat to the NameNode for a defined interval may be under load or may have died. Avoid sending any read/write requests to such 'stale' DataNodes.

In hdfs-site.xml set the below:

dfs.namenode.avoid.read.stale.datanode=true
dfs.namenode.avoid.write.stale.datanode=true
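
The staleness interval itself is controlled by dfs.namenode.stale.datanode.interval, in milliseconds (the default is 30000, i.e. 30 seconds), and can be tuned in the same file:

dfs.namenode.stale.datanode.interval=30000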

Use JNI-based group lookup over other implementations

Hadoop uses a pluggable interface with multiple possible implementations for looking up the group memberships of a user. The JNI-based implementation has better performance characteristics than other implementations.

In core-site.xml set:

hadoop.security.group.mapping=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback
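
To check which groups Hadoop resolves for a given user after the change, you can use the hdfs groups command (the username below is hypothetical):

hdfs groups alice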

*** This article focuses on settings that improve HDFS performance. However, they may impact other areas, such as stability and uptime. Please understand the settings before applying them ***

*** You might also be interested in the following articles: ***

OS Configurations for Better Hadoop Performance

Comments

Could you please clarify what you mean by (exclusive) in this paragraph: "Make sure that the blocksize ('dfs.blocksize' in 'hdfs-site.xml') is within the recommended range of 134217728 to 1073741824 (exclusive)"?

Do you mean the maximum value used should be 1073741823, and we shouldn't use 1073741824, which is exactly 1 GiB?