Created on 09-09-2016 01:44 AM
Transparent Huge Pages (THP) is a Linux memory management system that reduces the overhead of Translation Lookaside Buffer (TLB) lookups on machines with large amounts of memory by using larger memory pages.
However, the THP feature is known to perform poorly on Hadoop clusters, resulting in excessively high system CPU utilization.
Disable THP to reduce system CPU utilization on your worker nodes. This can be done by ensuring that both the `enabled` and `defrag` entries are set to [never] instead of [always].
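For example, to check and then disable both entries (run as root; on RHEL/CentOS 6 the path is /sys/kernel/mm/redhat_transparent_hugepage instead):

```bash
# Check the current settings; the active value is shown in [brackets]
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag

# Disable THP until the next reboot
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
```

To make the change persistent across reboots, add the two echo lines to /etc/rc.local so they run at boot.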
Some file systems offer better performance and stability than others. As such, the HDFS dfs.datanode.data.dir and YARN yarn.nodemanager.local-dirs properties should be configured to use mount points that are formatted with the most optimal file systems.
Take a look at this article on file system choices: https://community.hortonworks.com/articles/14508/best-practices-linux-file-systems-for-hdfs.html
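As a rough illustration, a hypothetical /etc/fstab entry for a dedicated ext4 data disk (the device name and mount point below are placeholders; noatime is commonly recommended so that reads do not trigger access-time writes):

```
# Hypothetical data disk used for HDFS/YARN local storage
/dev/sdb1  /grid/0  ext4  defaults,noatime  0 0
```

The resulting mount point would then be referenced in dfs.datanode.data.dir and yarn.nodemanager.local-dirs.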
The Linux kernel provides a tunable setting that controls how often swap space is used, called swappiness.
A swappiness setting of zero means that swap will be avoided unless absolutely necessary (when the host runs out of memory), while a setting of 100 means that programs will be swapped to disk almost immediately.
Reducing the swappiness value reduces the likelihood that the Linux kernel will push application pages from RAM into swap space. Swap space is much slower than RAM because it is backed by disk. Processes that are swapped out are likely to experience pauses, which can cause issues and missed SLAs.
Add `vm.swappiness=0` to /etc/sysctl.conf and reboot for the change to take effect, or apply it to the running system with `sysctl -w vm.swappiness=0`. To clear swap that is already in use without rebooting, run `swapoff -a` followed by `swapon -a` as root.
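Putting those steps together, a minimal sketch (run as root):

```bash
# Persist the setting across reboots
echo 'vm.swappiness=0' >> /etc/sysctl.conf

# Apply it to the running kernel immediately
sysctl -w vm.swappiness=0

# Move pages already in swap back into RAM (requires enough free memory)
swapoff -a && swapon -a
```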
The vm.dirty_background_ratio and vm.dirty_ratio parameters control the percentage of system memory that can be filled with dirty pages, i.e. memory pages that still need to be written to disk. Ratios that are too small force frequent IO operations, while ratios that are too large leave too much data in volatile memory, so tuning these ratios is a careful balance between optimizing IO operations and reducing the risk of data loss.
Set vm.dirty_background_ratio=20 and vm.dirty_ratio=50 in /etc/sysctl.conf and reboot for the change to take effect, or apply the values to the running system with `sysctl -p`.
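For example (run as root):

```bash
# Append the suggested values to /etc/sysctl.conf
cat >> /etc/sysctl.conf <<'EOF'
vm.dirty_background_ratio=20
vm.dirty_ratio=50
EOF

# Reload all settings from /etc/sysctl.conf without rebooting
sysctl -p
```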
CPU frequency scaling is configurable and commonly defaults to favoring power saving over performance. For Hadoop clusters, it is important to configure the CPUs for performance instead.
Please set the scaling governor to performance, which runs the CPU at maximum frequency. To do so, run `cpufreq-set -r -g performance`, or write 'performance' into every /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor file, as shown below.
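A minimal sketch of both approaches (run as root; the available governors depend on your CPU frequency driver, so check /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors first):

```bash
# Option 1: cpufrequtils, applied to all related CPUs
cpufreq-set -r -g performance

# Option 2: write the governor directly through sysfs, one CPU at a time
for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  echo performance > "$gov"
done
```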
SSDs provide a great performance boost, and when configured optimally for Hadoop workloads they can deliver even better results. The IO scheduler, read-ahead buffer size, and queue depth (number of requests) are the main parameters to consider when tuning.
Refer to the following link for further details: https://wiki.archlinux.org/index.php/Solid_State_Drives#I.2FO_Scheduler
For every SSD device, set the following (where {{device}} is the device's sysfs directory, e.g. /sys/block/sdb): `echo 'deadline' > {{device}}/queue/scheduler`; `echo '256' > {{device}}/queue/read_ahead_kb`; `echo '256' > {{device}}/queue/nr_requests`.
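A minimal sketch that applies those three settings to every non-rotational disk (run as root; detecting SSDs via the rotational flag is an assumption, so verify it matches your hardware first):

```bash
# Apply the SSD tuning above to every non-rotational /dev/sd* device
for dev in /sys/block/sd*; do
  if [ "$(cat "$dev/queue/rotational")" = "0" ]; then
    echo 'deadline' > "$dev/queue/scheduler"
    echo '256'      > "$dev/queue/read_ahead_kb"
    echo '256'      > "$dev/queue/nr_requests"
  fi
done
```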
Created on 10-20-2016 09:00 PM
Documentation may need to be updated: with newer kernels, swappiness does not need to be set to 0. Read this article from @emaxwell:
https://community.hortonworks.com/articles/33522/swappiness-setting-recommendation.html