Created on 05-16-2017 03:25 PM
Use recommended mount options for all HDFS data disks
There are specific filesystem mount options that have proven to be more efficient for Hadoop clusters, and using them provides performance benefits. Since mount options are applied when the filesystem is mounted (for example at system boot or on remount), changing /etc/fstab alone is not enough for these settings to take effect. The recommended approach is to make the mount option changes and then either manually remount the individual file systems or reboot the host.
Use the following mount options for the respective file systems:
ext4 —> inode_readahead_blks=128, data=writeback, noatime, nodev
xfs —> noatime
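As an illustrative sketch, an /etc/fstab entry for an ext4 data disk could look like the following; the device name /dev/sdb1 and mount point /grid/0 are placeholders, so adjust them to your own disk layout:

/dev/sdb1  /grid/0  ext4  inode_readahead_blks=128,data=writeback,noatime,nodev  0 0

After editing /etc/fstab, unmount and remount the disk (or reboot the host) as described above so the new options take effect.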
Configure HDFS block size for optimal performance
Having an optimal HDFS block size boosts NameNode performance as well as job execution performance.
Make sure that the block size ('dfs.blocksize' in 'hdfs-site.xml') is within the recommended range of 134217728 bytes (128 MB) to 1073741824 bytes (1 GB), exclusive.
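As a sketch, a 256 MB block size (a value inside the recommended range, chosen here purely for illustration) would be set in hdfs-site.xml as follows:

<property>
  <name>dfs.blocksize</name>
  <value>268435456</value>
</property>

On recent Hadoop versions the value can also be given with a size suffix, for example 256m.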
Enable HDFS short circuit reads
In HDFS, reads normally go through the DataNode. Thus, when the client asks the DataNode to read a file, the DataNode reads that file off of the disk and sends the data to the client over a TCP socket. So-called short-circuit reads bypass the DataNode, allowing the client to read the file directly. Obviously, this is only possible in cases where the client is co-located with the data. Short-circuit reads provide a substantial performance boost to many applications.
Enable short-circuit reads for better performance. Short-circuit local reads use a UNIX domain socket, so the native library libhadoop.so must be available and a socket path must be configured via dfs.domain.socket.path. In hdfs-site.xml set the below:
dfs.client.read.shortcircuit = true
dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket
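A minimal hdfs-site.xml sketch of these two properties, using the socket path suggested above:

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>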
Avoid file sizes that are smaller than a block size
Average block size should be greater than the recommended minimum of 67108864 bytes (64 MB). An average size below this adds more burden to the NameNode, can cause heap/GC issues, and makes storage and processing inefficient.
Set the block size greater than 67108864 bytes (64 MB). Also, use one or more of the following techniques to consolidate smaller files (see the example after this list):
- run Hive/HBase compactions
- merge small files
- use HAR to compact small files
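As an illustrative sketch of the HAR option, the hadoop archive tool can pack a directory of small files into a single archive; the paths below are placeholders:

hadoop archive -archiveName small-files.har -p /user/data/input /user/data/archived

The archived content can then be read back through the har:// scheme (for example har:///user/data/archived/small-files.har), and the original small files can be removed once the archive has been verified.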
Tune DataNode JVM for optimal performance
The DataNode is sensitive to JVM performance and behavior. Make sure that the DataNode JVM is configured for optimal performance.
Sample JVM configs:
-Djava.net.preferIPv4Stack=true
-XX:ParallelGCThreads=8
-XX:+UseConcMarkSweepGC
-Xloggc:* (where * is the path to the GC log file)
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps
Also: -Xms should be the same as -Xmx, and the new generation size should be 1/8 of the total JVM heap size.
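A sketch of how these flags could be applied through hadoop-env.sh; the 4 GB heap, 512 MB new generation, and GC log path are illustrative assumptions, not recommendations:

export HADOOP_DATANODE_OPTS="-Xms4g -Xmx4g -XX:NewSize=512m -XX:MaxNewSize=512m \
  -Djava.net.preferIPv4Stack=true -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps \
  -Xloggc:/var/log/hadoop/hdfs/gc-datanode.log $HADOOP_DATANODE_OPTS"

Note that the 512 MB new generation is one eighth of the 4 GB heap, matching the guideline above.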
Avoid reads or writes from stale DataNodes
DataNodes that have not sent a heartbeat to the NameNode for a defined interval may be under load or may have died. Avoid sending any read/write requests to such 'stale' DataNodes.
In hdfs-site.xml set the below:
dfs.namenode.avoid.read.stale.datanode = true
dfs.namenode.avoid.write.stale.datanode = true
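A minimal hdfs-site.xml sketch of these properties:

<property>
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>
</property>
<property>
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>true</value>
</property>

The interval after which a DataNode is considered stale is controlled by dfs.namenode.stale.datanode.interval (30 seconds by default).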
Use JNI-based group lookup over other implementations
Hadoop uses a pluggable interface with multiple possible implementations for looking up the group memberships of a user. The JNI-based implementation has better performance characteristics than other implementations.
In core-site.xml set: hadoop.security.group.mapping=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback
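A minimal core-site.xml sketch for this setting; JniBasedUnixGroupsMappingWithFallback uses the JNI implementation when libhadoop.so is available and falls back to the shell-based lookup otherwise:

<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback</value>
</property>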
*** This article focuses on settings that improve HDFS performance. However, they may impact other areas such as stability and uptime. Please understand the settings before applying them. ***
Created on 06-19-2017 03:00 PM
Could you please clarify what you mean by (exclusive) in this paragraph: "Make sure that the blocksize ('dfs.blocksize' in 'hdfs-site.xml') is within the recommended range of 134217728 to 1073741824 (exclusive)"?
Do you mean the maximum value used should be 1073741823 and we shouldn't use 1073741824, which is exactly 1 GiB?