Created 09-29-2015 03:55 PM
Created 09-29-2015 06:31 PM
The best practice is to avoid active anti-virus (AV) systems that monitor access to the underlying disks used for metadata storage by the following processes:
These processes store data structures only; nothing they store is executable by the underlying OS. Because these processes can be very active, potentially performing continuous writes against large files, the best performance requires direct, unimpeded access to the underlying filesystem. Any AV system that traps filesystem calls will have a negative impact on Hadoop system performance.
Some sites choose to run periodic AV scans (for example, a weekly scan) on client, gateway, and "edge node" systems where users and developers connect and run local processes. These scans do not interfere with cluster performance, but they are important for safeguarding the edge-connected systems that are the main clients of the cluster.
Created 10-02-2015 03:56 PM
Just a note that YARN may need to execute things that are placed into its local cache on the NodeManagers (NMs); it's not purely data storage. This is why you can't have YARN-related directories mounted with the noexec option in /etc/fstab.
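To make the noexec point concrete, here is a minimal sketch of how one might check whether a directory sits on a filesystem mounted with noexec. It parses text in the /proc/mounts format and finds the longest mount point containing the path. The function names and the sample path /hadoop/yarn/local are hypothetical; on a real node you would pass in the contents of /proc/mounts and the directories configured in yarn.nodemanager.local-dirs.

```python
import posixpath


def parse_mounts(mounts_text):
    """Parse /proc/mounts-style text into (mount_point, options) pairs.

    Each line looks like: device mount_point fstype options dump pass.
    Escaped characters in mount points (e.g. \\040 for a space) are not
    decoded in this sketch.
    """
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        entries.append((fields[1], set(fields[3].split(","))))
    return entries


def mount_options_for(path, entries):
    """Return (mount_point, options) for the longest mount point containing path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point, options in entries:
        mp = posixpath.normpath(mount_point)
        # Match the mount point itself or anything beneath it.
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            if best is None or len(mp) > len(best[0]):
                best = (mp, options)
    return best


def is_noexec(path, entries):
    """True if the filesystem holding path is mounted with noexec."""
    match = mount_options_for(path, entries)
    return match is not None and "noexec" in match[1]


# Hypothetical sample; on a node, read the text from /proc/mounts instead.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /hadoop ext4 rw,noexec,nosuid 0 0
"""

sample_entries = parse_mounts(SAMPLE_MOUNTS)
yarn_local_dir = "/hadoop/yarn/local"  # assumed value, not from the thread
if is_noexec(yarn_local_dir, sample_entries):
    print(yarn_local_dir + " is on a noexec mount; YARN containers may fail to launch")
```

A check like this could be run against each configured local directory before starting the NodeManager, so that a noexec mount is caught early rather than showing up later as a container launch failure.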
Created 10-02-2015 04:12 PM
Sometimes the requirement to have AV on the servers is unavoidable due to security policies that cannot be challenged. In that event, be prepared to add significantly more nodes, more memory, and more CPUs to get the same level of performance.