
Note: Cloudera does not support antivirus software of any kind.

 

This article contains general recommendations for excluding YARN components and directories from antivirus scans and monitoring.

 

The three primary locations you will want to exclude from antivirus scanning are:

  1. Data directories: These can be very large, and therefore take a long time to scan; they can also be very write-heavy, and therefore suffer performance impacts or failures if the antivirus holds up writes.
  2. Log directories: These are write-heavy, so they are exposed to the same performance impacts or failures when the antivirus holds up writes.
  3. Scratch directories: These are internal locations used by some services for writing temporary data, and can also cause performance impacts or failures if the antivirus holds up writes.

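Note: The directories YARN uses are user-configurable, so the actual paths vary from cluster to cluster. I recommend you exclude whichever directories the following properties point to. These properties can be found in Ambari > YARN > Configs > Advanced: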

 

yarn.nodemanager.local-dirs
yarn.nodemanager.log-dirs
yarn.nodemanager.recovery.dir

yarn.timeline-service.leveldb-state-store.path
yarn.timeline-service.leveldb-timeline-store.path
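
Because these values differ per cluster, it can help to read them from the configuration rather than guess. The following is a minimal sketch, not an official tool: it assumes the properties are available in a Hadoop-style XML configuration (typically a yarn-site.xml file), and the helper name `yarn_dirs_from_xml`, the sample XML, and the paths in it are hypothetical. It also does not expand `${...}` variable references that a value may contain.

```python
import xml.etree.ElementTree as ET

# Properties named in this article whose values are directories to exclude.
YARN_DIR_PROPERTIES = [
    "yarn.nodemanager.local-dirs",
    "yarn.nodemanager.log-dirs",
    "yarn.nodemanager.recovery.dir",
    "yarn.timeline-service.leveldb-state-store.path",
    "yarn.timeline-service.leveldb-timeline-store.path",
]


def yarn_dirs_from_xml(xml_text):
    """Return {property name: [directories]} for the properties listed above.

    Values are split on commas, since some of these properties can hold
    more than one directory. ${...} references are NOT expanded here.
    """
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in YARN_DIR_PROPERTIES and value:
            found[name] = [d.strip() for d in value.split(",") if d.strip()]
    return found


# Hypothetical sample configuration; real paths will differ on your cluster.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/hadoop/yarn/local,/data1/yarn/local</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/hadoop/yarn/log</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name, dirs in yarn_dirs_from_xml(SAMPLE).items():
        print(name, dirs)
```

To run it against a real node, read the contents of that node's yarn-site.xml and pass the text to `yarn_dirs_from_xml`. Compare the result with what Ambari shows before adding anything to an exclusion list.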

 

Consider excluding the following directories and all of their subdirectories:

 

Installation, Configuration, and Libraries

/usr/hdp
/etc/hadoop
/var/lib/hadoop-yarn

Runtime and Logging

/var/run/hadoop-yarn
/var/log/hadoop-yarn
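
To confirm that an exclusion list covers a particular file, a short check like the one below may be useful. It is only a sketch: the function name `is_excluded` is hypothetical, the list holds the five directories named in this article, and real antivirus products apply their own matching rules, which may differ. The check treats a directory and everything beneath it as excluded, and does not resolve symlinks or `..` segments.

```python
from pathlib import PurePosixPath

# The directories listed in this article (subdirectories are included).
EXCLUDED_DIRS = [
    "/usr/hdp",
    "/etc/hadoop",
    "/var/lib/hadoop-yarn",
    "/var/run/hadoop-yarn",
    "/var/log/hadoop-yarn",
]


def is_excluded(path, excluded=EXCLUDED_DIRS):
    """Return True if path is one of the excluded directories or lies beneath one."""
    candidate = PurePosixPath(path)
    for directory in excluded:
        root = PurePosixPath(directory)
        # Compare whole path components, so /var/log/hadoop-yarn-old
        # is not mistaken for a child of /var/log/hadoop-yarn.
        if candidate == root or root in candidate.parents:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for p in ["/var/log/hadoop-yarn/nodemanager/example.log", "/var/log/messages"]:
        print(p, is_excluded(p))
```

Comparing path components rather than raw string prefixes avoids accidentally treating a sibling directory with a similar name as excluded.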

 

 

 

Note: HDFS, YARN, MapReduce, and ZooKeeper are interdependent, so you are likely to experience unsatisfactory results if you exclude the YARN directories without also excluding those of the other components.
