Member since: 10-01-2018
Posts: 274
Kudos Received: 6
Solutions: 5
My Accepted Solutions
Views | Posted |
---|---|
188 | 11-27-2024 12:50 PM |
3681 | 09-28-2020 08:05 AM |
3265 | 04-16-2020 09:20 AM |
1618 | 04-16-2020 08:48 AM |
4232 | 04-16-2020 08:10 AM |
09-29-2020
07:30 AM
I already had many alerts for errors before even trying to start the services. Here is a folder with screenshots of all the alerts.
05-11-2020
09:17 AM
Note: Cloudera does not support antivirus software of any kind.
This article contains general recommendations for excluding ZooKeeper components and directories from antivirus scans and monitoring.
The three primary locations you will want to exclude from antivirus are:
Data directories: These can be very large, and therefore take a long time to scan; they can also be very write-heavy, and therefore suffer performance impacts or failures if the antivirus holds up writes.
Log directories: These are write-heavy.
Scratch directories: These are internal locations used by some services for writing temporary data, and can also cause performance impacts or failures if the antivirus holds up writes.
Note: ZooKeeper has a user-configurable data directory. I recommend you exclude it. This directory can be found by running the following command:
# grep dataDir /etc/zookeeper/conf/zoo.cfg
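If you would rather get just the path than read the grep output by eye, here is a minimal sketch. The function name zk_dirs is made up for illustration, and the default path is simply the one from the command above, so adjust it if your zoo.cfg lives elsewhere. It also prints dataLogDir, a separate ZooKeeper setting for the transaction log directory, which only appears if someone has configured it.

```shell
# Print the directories configured in a zoo.cfg file, one per line.
# zk_dirs is a hypothetical helper name; $1 optionally overrides the
# default config path (assumed to be the one used in this article).
zk_dirs() {
    cfg="${1:-/etc/zookeeper/conf/zoo.cfg}"
    # Keep only uncommented dataDir= / dataLogDir= lines and print the value.
    sed -n -E 's/^[[:space:]]*(dataDir|dataLogDir)[[:space:]]*=[[:space:]]*(.*)$/\2/p' "$cfg"
}
```

Running zk_dirs on each ZooKeeper host should give you the paths to add to the exclusion list for that host.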
Consider excluding the following directories and all of their subdirectories:
Installation, Configuration, and Libraries
/usr/hdp
/etc/hadoop
Runtime and Logging
/var/run/zookeeper
/var/log/zookeeper
Note: HDFS, YARN, MapReduce and ZooKeeper are mutually interdependent and you are likely to experience unsatisfactory results if you fail to also exclude the other components.
05-11-2020
09:17 AM
Note: Cloudera does not support antivirus software of any kind.
This article contains general recommendations for excluding MapReduce components and directories from antivirus scans and monitoring.
The three primary locations you will want to exclude from antivirus are:
Data directories: These can be very large, and therefore take a long time to scan; they can also be very write-heavy, and therefore suffer performance impacts or failures if the antivirus holds up writes.
Log directories: These are write-heavy.
Scratch directories: These are internal locations used by some services for writing temporary data, and can also cause performance impacts or failures if the antivirus holds up writes.
Note: Some directories in MapReduce are user-configurable. I recommend you exclude them. These properties can be found in Ambari > YARN > Configs > Advanced, and this one in particular should be excluded:
mapreduce.jobhistory.recovery.store.leveldb.path
Consider excluding the following directories and all of their subdirectories:
Installation, Configuration, and Libraries
/usr/hdp
/etc/hadoop
/var/lib/hadoop-mapreduce
Runtime and Logging
/var/run/hadoop-mapreduce
/var/log/hadoop-mapreduce
Note: HDFS, YARN, MapReduce, and ZooKeeper are mutually interdependent and you are likely to experience unsatisfactory results if you fail to also exclude the other components.
05-11-2020
09:17 AM
1 Kudo
Note: Cloudera does not support antivirus software of any kind.
This article contains general recommendations for excluding YARN components and directories from antivirus scans and monitoring.
The three primary locations you will want to exclude from antivirus are:
Data directories: These can be very large, and therefore take a long time to scan; they can also be very write-heavy, and therefore suffer performance impacts or failures if the antivirus holds up writes.
Log directories: These are write-heavy.
Scratch directories: These are internal locations used by some services for writing temporary data, and can also cause performance impacts or failures if the antivirus holds up writes.
Note: The directories YARN uses are user-configurable. I recommend you exclude them. These properties can be found in Ambari > YARN > Configs > Advanced:
yarn.nodemanager.local-dirs
yarn.nodemanager.log-dirs
yarn.nodemanager.recovery.dir
yarn.timeline-service.leveldb-state-store.path
yarn.timeline-service.leveldb-timeline-store.path
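If you prefer the command line to the Ambari UI, the same properties can be read from yarn-site.xml on the host. This is only a sketch: hadoop_prop and yarn_dirs are made-up helper names, the default path /etc/hadoop/conf/yarn-site.xml is an assumption about your layout, and the awk logic assumes each <name> and <value> element sits on one line. It is not a real XML parser.

```shell
# Print the value of one property from a Hadoop *-site.xml file.
# $1 = property name, $2 = path to the XML file (hypothetical helper).
hadoop_prop() {
    awk -v want="$1" '
        /<name>/ { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
        /<value>/ && n == want {
            v = $0; gsub(/.*<value>|<\/value>.*/, "", v); print v; exit
        }
    ' "$2"
}

# Print every directory named by the YARN properties listed above,
# one per line. Comma-separated values are split into separate lines.
yarn_dirs() {
    site="${1:-/etc/hadoop/conf/yarn-site.xml}"   # assumed default location
    for p in yarn.nodemanager.local-dirs \
             yarn.nodemanager.log-dirs \
             yarn.nodemanager.recovery.dir \
             yarn.timeline-service.leveldb-state-store.path \
             yarn.timeline-service.leveldb-timeline-store.path; do
        hadoop_prop "$p" "$site" | tr ',' '\n'
    done
}
```

Properties that are not set in the file simply produce no output, so compare the result against what Ambari shows before relying on it.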
Consider excluding the following directories and all of their subdirectories:
Installation, Configuration, and Libraries
/usr/hdp
/etc/hadoop
/var/lib/hadoop-yarn
Runtime and Logging
/var/run/hadoop-yarn
/var/log/hadoop-yarn
Note: HDFS, YARN, MapReduce, and ZooKeeper are mutually interdependent and you are likely to experience unsatisfactory results if you fail to also exclude the other components.
05-11-2020
09:16 AM
1 Kudo
Note: Cloudera does not support antivirus software of any kind.
This article contains general recommendations for excluding HDFS components and directories from antivirus scans and monitoring.
The three primary locations you will want to exclude from antivirus are:
Data directories: These can be very large, and therefore take a long time to scan; they can also be very write-heavy, and therefore suffer performance impacts or failures if the antivirus holds up writes.
Log directories: These are write-heavy.
Scratch directories: These are internal locations used by some services for writing temporary data, and can also cause performance impacts or failures if the antivirus holds up writes.
Note: The directories in HDFS are user-configurable. I recommend you exclude these, especially the data directory for the DataNode and the meta directories for the NameNode and JournalNode. These details can be found in the “hdfs-site.xml” file:
# grep -A1 "dir" /etc/hadoop/conf/hdfs-site.xml
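The grep output still includes the XML tags, so here is a rough sketch that pairs each property whose name contains "dir" with its value. The function name hdfs_dir_props is made up for illustration, and the awk logic assumes each <name> and <value> element is on a single line, which may not hold for a hand-edited file.

```shell
# List name=value for every property whose name contains "dir".
# hdfs_dir_props is a hypothetical helper; $1 optionally overrides the
# config path used in the grep command above.
hdfs_dir_props() {
    site="${1:-/etc/hadoop/conf/hdfs-site.xml}"
    awk '
        # Remember the most recent property name.
        /<name>/ { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
        # Print the value only when that name mentions "dir".
        /<value>/ && n ~ /dir/ {
            v = $0; gsub(/.*<value>|<\/value>.*/, "", v)
            print n "=" v
            n = ""
        }
    ' "$site"
}
```

Values can be comma-separated lists when a DataNode uses several disks, so each entry in the list needs its own exclusion.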
Consider excluding the following directories and all of their subdirectories:
Installation, Configuration, and Libraries
/usr/hdp
/etc/hadoop
/var/lib/hadoop-hdfs
Runtime and Logging
/var/run/hadoop
/var/log/hadoop
Scratch and Temp
/tmp/hadoop-hdfs
Note: HDFS, YARN, MapReduce and ZooKeeper are mutually interdependent and you are likely to experience unsatisfactory results if you fail to exclude the other components.
05-11-2020
09:16 AM
1 Kudo
Note: Cloudera does not support antivirus software of any kind.
This article contains general recommendations for excluding Ambari components and directories from antivirus scans and monitoring.
The three primary locations you will want to exclude from antivirus are:
Data directories: These can be very large, and therefore take a long time to scan; they can also be very write-heavy, and therefore suffer performance impacts or failures if the AV holds up writes.
Log directories: These are write-heavy.
Scratch directories: These are internal locations used by some services for writing temporary data, and can also cause performance impacts or failures if the AV holds up writes.
Note: Ambari has a special requirement in the form of a user-configurable database. I recommend you exclude this database. However, the details of this database are set on installation; the database may be colocated with ambari-server, or on a remote host. Consult with your database administrators for details on the path where the database information is stored; Ambari does not keep this information anywhere in its configuration. If you need details about which database Ambari is using, search for "jdbc" in the “ambari.properties” file:
# grep 'jdbc' /etc/ambari-server/conf/ambari.properties
Consider excluding the following directories and all of their subdirectories:
Installation, Configuration, and Libraries
/usr/hdp
/usr/lib/ambari-agent
/usr/lib/ambari-server
/etc/hadoop
/etc/ambari-agent
/etc/ambari-server
/var/lib/ambari-agent
/var/lib/ambari-server
Runtime and Logging
/var/run/ambari-agent
/var/run/ambari-server
/var/log/ambari-agent
/var/log/ambari-server
04-30-2020
12:21 AM
Note: Cloudera does not support antivirus software of any kind.
This article contains generic recommendations for excluding HDP components and directories from AV scans and monitoring. It is important to note that these recommendations do not apply to each service, and further, some services will have additional items to exclude which are unique to them. These details will be addressed in individual articles dedicated to the service in question.
The three primary locations you will want to exclude from antivirus are:
Data directories: These can be very large, and therefore take a long time to scan; they can also be very write-heavy, and therefore suffer performance impacts or failures if the AV holds up writes.
Log directories: These are write-heavy.
Scratch directories: These are internal locations used by some services for writing temporary data, and can also cause performance impacts or failures if the AV holds up writes.
Consider excluding the following directories and all of their subdirectories:
Installation, Configuration, and Libraries
/hadoop
/usr/hdp
/etc/hadoop
/etc/<component>
/var/lib/<component>
Runtime and Logging
/var/run/<component>
/var/log/<component>
Scratch and Temp
/var/tmp/<component>
/tmp/<component>
Note: The <component> does not only refer to the service name, as a given service may have multiple daemons with their own directories. Example: ambari-agent and ambari-server.
Across HDP services there are also many user-configurable locations. Most of these can be found in Ambari properties with names like "service.scratch.dir" and "service.data.dir"; go to Ambari > Service > Configs > Advanced and search for any property containing "dir", all of which may be considered for exclusion.
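To turn the <component> pattern into a concrete list, something like the following sketch may help. The function name component_paths is made up for illustration; it only expands the six per-component locations listed in this article and does not check whether a path actually exists on the host.

```shell
# Expand the <component> placeholders for one or more component names.
# component_paths is a hypothetical helper; it prints candidate paths
# only, and does not test whether they exist.
component_paths() {
    for c in "$@"; do
        for base in /etc /var/lib /var/run /var/log /var/tmp /tmp; do
            printf '%s/%s\n' "$base" "$c"
        done
    done
}
```

For example, component_paths ambari-agent ambari-server prints twelve candidate paths. You would still add /hadoop, /usr/hdp, /etc/hadoop, and any user-configured directories by hand.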
04-17-2020
10:03 PM
Thank you for this reply! This has been quite difficult for me to troubleshoot, but I finally figured it out.
These machines I've been using had chrony on them all along, but the previous machines I set up did not have chrony installed. Chrony and ntpd were both enabled, and ntpd was exiting on reboot. Because the host monitor issues "ntpq -np", and ntpd was loaded but inactive, it would report a failure to query the server, even though chrony was running. I had no idea that chrony was installed, and thus, the whole problem could've been solved by just disabling/uninstalling ntpd. I spent WAY too many hours to come to such a simple solution.
It may be very helpful to someone who doesn't understand network time protocols very well if the documentation explained potential conflicts between ntpd and chronyd, or even suggested taking a second to check which (if any) you already have installed. Maybe it won't be an issue for most people, but for me, assuming that I didn't have chrony already running cost me a bunch of time getting my cluster healthy. I would check, find ntpd dead, see no problems reported on Host Monitor, wonder why the hell ntpd died, kill ntpd, run ntpdate, restart ntpdate, restart scm-agent, and that would "fix" it, but on reboot it would go back to using chrony and exit ntpd, and host monitor would report failure to query ntp service, even though the machine was using chrony and synced just fine all along.
I appreciate your help!
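For anyone who wants to run the check I wish I had run first, here is a rough sketch. It assumes a systemd host where the units are named ntpd and chronyd (other distributions may name them differently), and the function name time_daemon_status is made up for illustration.

```shell
# Report whether ntpd and chronyd are running and enabled at boot.
# time_daemon_status is a hypothetical helper; the unit names ntpd and
# chronyd are an assumption about the host.
time_daemon_status() {
    for svc in ntpd chronyd; do
        active=no
        enabled=no
        # Both subcommands return 0 only when the state is true.
        systemctl is-active --quiet "$svc" && active=yes
        systemctl is-enabled --quiet "$svc" && enabled=yes
        printf '%s active=%s enabled=%s\n' "$svc" "$active" "$enabled"
    done
}
```

If both services show enabled=yes, you are probably in the situation I described, and disabling one of them is the fix.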
04-16-2020
09:20 AM
You are correct; the drivers are built for each platform. The HDP downloads page is here: https://www.cloudera.com/downloads/hdp.html It contains the JDBC41 driver for Hive.
04-16-2020
08:48 AM
You are correct; stopping a service is not the same as a service crashing. Alerts generally do not cover intentional administrator activity like starting and stopping of services. However, you do still have access to this information; starting and stopping of services are covered under Events, of the Audit type:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_dg_events.html#cmug_topic_10
The AUDIT_EVENT type covers actions performed. This is also where you will track configuration changes.
Turning to the question of API use, here is the Cloudera Manager documentation's section on the API:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cloudera_manager.html#concept_nsg_jq3_mz
Here is the Tutorial linked from that doc, which has a ton of examples, including starting and stopping of services:
https://archive.cloudera.com/cm6/6.3.0/generic/jar/cm_api/apidocs/tutorial.html
While the Alerts don't tell you when services are started and stopped, you can query Events through the API. We have a Knowledge Base Article on the subject:
https://my.cloudera.com/knowledge/Accessing-Critical-Events-Using-the-Cloudera-Manager-API-?id=72521