
Note: Cloudera does not support antivirus software of any kind.

 

This article contains general recommendations for excluding MapReduce components and directories from antivirus scans and monitoring.

 

The three primary kinds of location to exclude from antivirus scanning are:

  1. Data directories: These can be very large, and therefore take a long time to scan; they can also be very write-heavy, and therefore suffer performance impacts or failures if the antivirus holds up writes.
  2. Log directories: These are write-heavy.
  3. Scratch directories: These are internal locations used by some services for writing temporary data, and can also cause performance impacts or failures if the antivirus holds up writes.

Note: Some MapReduce directories are user-configurable, and I recommend excluding whatever paths they are set to. These properties can be found in Ambari > YARN > Configs > Advanced; the directory set by this one in particular should be excluded:

mapreduce.jobhistory.recovery.store.leveldb.path
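Because the value of this property differs from cluster to cluster, it can help to read it from the configuration rather than guess. The following is a minimal Python sketch, assuming the standard Hadoop XML layout of property elements with name and value children. The helper name read_hadoop_property and the sample value are hypothetical; on a live host the text would normally come from mapred-site.xml, whose location (commonly under /etc/hadoop/conf) is also an assumption to verify for your cluster.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_hadoop_property(config_text: str, name: str) -> Optional[str]:
    """Return the value of a property in a Hadoop-style XML config, or None."""
    root = ET.fromstring(config_text)
    for prop in root.iter("property"):
        # Each <property> holds a <name> and a <value> child element.
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical sample: the path below is a placeholder, not a real default.
SAMPLE = """<configuration>
  <property>
    <name>mapreduce.jobhistory.recovery.store.leveldb.path</name>
    <value>/example/mapreduce/jhs</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    key = "mapreduce.jobhistory.recovery.store.leveldb.path"
    print(read_hadoop_property(SAMPLE, key))
```

The returned path is the directory to add to the antivirus exclusion list alongside the fixed directories in this article.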

 

 

Consider excluding the following directories and all of their subdirectories:

 

Installation, Configuration, and Libraries

/usr/hdp
/etc/hadoop
/var/lib/hadoop-mapreduce

 

 

Runtime and Logging

/var/run/hadoop-mapreduce
/var/log/hadoop-mapreduce
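When reviewing antivirus logs, it can be useful to check whether a flagged file actually falls under one of the directories listed in this article. The sketch below is one way to do that in Python, assuming POSIX-style paths. The names EXCLUDED_DIRS and is_excluded are hypothetical, and the check is purely textual: it does not query any antivirus product or touch the filesystem.

```python
from pathlib import PurePosixPath

# Directories listed in this article; any user-configured paths
# would need to be appended for a given cluster.
EXCLUDED_DIRS = [
    "/usr/hdp",
    "/etc/hadoop",
    "/var/lib/hadoop-mapreduce",
    "/var/run/hadoop-mapreduce",
    "/var/log/hadoop-mapreduce",
]


def is_excluded(path: str, excluded_dirs=EXCLUDED_DIRS) -> bool:
    """True if path is an excluded directory or lies anywhere beneath one."""
    candidate = PurePosixPath(path)
    for directory in excluded_dirs:
        base = PurePosixPath(directory)
        # Compare whole path components so that a sibling such as
        # /var/log/hadoop-mapreduce-old is not matched by prefix.
        if candidate == base or base in candidate.parents:
            return True
    return False


if __name__ == "__main__":
    for sample in ("/var/log/hadoop-mapreduce/mapred/job.log",
                   "/home/user/notes.txt"):
        print(sample, is_excluded(sample))
```

Matching on path components rather than string prefixes reflects the "all of their subdirectories" wording without accidentally covering similarly named sibling directories.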

 

 

Note: HDFS, YARN, MapReduce, and ZooKeeper depend on one another, so excluding only the MapReduce directories is unlikely to be enough. Exclude the corresponding directories for the other components as well.
