Support Questions

What is the best monitoring tool to install on the OS of Hadoop cluster machines (big data servers)?


We manage many Hadoop clusters running Red Hat Enterprise Linux 7.x.

Based on our experience (many problems with low memory, disk performance, networking, etc.), we agree that we need to install a monitoring tool that can retain at least one month of detailed history.

From the link below, https://neverendingsecurity.wordpress.com/tag/atop/, we saw a lot of monitoring tools, but we are not sure which is best for Hadoop clusters.

Meanwhile we have installed the atop tool, which works fine (but takes a lot of space under /var/log/atop).
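As an aside, atop's disk usage can usually be capped through its service config. A sketch of the relevant settings, assuming the EPEL atop package layout on RHEL/CentOS 7 (verify the file and variable names on your own hosts):

```shell
# /etc/sysconfig/atop -- config read by the atop service on RHEL/CentOS
# (variable names as shipped with the EPEL atop package; check your copy)

LOGINTERVAL=600      # sampling interval in seconds (larger = smaller logs)
LOGGENERATIONS=31    # days of daily log files to keep under /var/log/atop
```

With roughly one month of generations kept and a coarser sampling interval, the history requirement above can be met without /var/log/atop growing unbounded.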

**But we are still wondering whether this is a good choice.**

Michael-Bronson
1 ACCEPTED SOLUTION

Super Collaborator

Nagios / OpsView / Sensu are popular options I've seen.

StatsD / collectd / Metricbeat are daemon metric collectors that run on each server (though Metricbeat is somewhat tied to an Elasticsearch cluster).

Prometheus is a popular option nowadays; it scrapes metrics exposed by each local service.

I have played around a bit with netdata, though I'm not sure whether it can be applied to Hadoop monitoring use cases.

Datadog is a vendor that offers lots of integrations, such as Hadoop, YARN, Kafka, ZooKeeper, etc.

... Realistically, you need some JMX + system monitoring tool, and plenty exist.
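On the JMX side: Hadoop daemons expose their JMX metrics as JSON over HTTP via a /jmx servlet on the daemon's web port, which is what most of the tools above scrape. A minimal sketch of pulling one attribute yourself (the host, port, and bean names below are examples, not values from this thread; check your own daemon's web UI port, e.g. the NameNode's):

```python
import json
from urllib.request import urlopen


def parse_jmx_attribute(payload, attribute):
    """Pull one attribute out of a /jmx JSON payload (a {"beans": [...]} dict)."""
    for bean in payload.get("beans", []):
        if attribute in bean:
            return bean[attribute]
    return None  # attribute not found in any returned bean


def fetch_jmx_attribute(base_url, bean_query, attribute):
    """Fetch one attribute from a Hadoop daemon's /jmx endpoint.

    base_url is e.g. "http://namenode:50070" (Hadoop 2.x NameNode web port);
    bean_query narrows the response to the MBeans you care about.
    """
    with urlopen(f"{base_url}/jmx?qry={bean_query}") as resp:
        return parse_jmx_attribute(json.load(resp), attribute)


# Example usage (requires a reachable NameNode; names are illustrative):
# remaining = fetch_jmx_attribute(
#     "http://namenode:50070",
#     "Hadoop:service=NameNode,name=FSNamesystemState",
#     "CapacityRemaining",
# )
```

A cron job or collector agent polling this endpoint and shipping the values to any time-series store covers the "one month of history" requirement.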


4 REPLIES


check_mk is what most people use.
It is easy to configure and provides a nice UI with history saved.
The check_mk agents consume very little CPU and RAM, avoiding any negative impact on other applications running on the host.


Actually, we are thinking of a tool that is installed on each Linux machine, like atop. check_mk monitors the OS remotely from Windows machines, while what we want is a tool that collects the info from the OS itself and runs on each machine itself.

Michael-Bronson

Super Guru
@Michael Bronson

I don't normally like to suggest non-ASF options here on HCC, but have you checked out Elastic Beats? I am using Metricbeat to get Unix cluster monitoring on our Ambari nodes, as well as Windows workstation metrics such as:

  1. CPU Used
  2. Memory Used
  3. Disk Used
  4. Load Average
  5. Inbound/Outbound Traffic
  6. Host Processes
  7. and more...

There is also Winlogbeat, which allows us to tap into Windows event logs and performance monitoring.
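For reference, collecting the metrics listed above takes only a few lines of Metricbeat's system module config. A minimal sketch (the Elasticsearch host is a placeholder; compare against the metricbeat.yml shipped with the package):

```yaml
# metricbeat.yml -- minimal system-metrics sketch
metricbeat.modules:
  - module: system
    metricsets: ["cpu", "memory", "filesystem", "load", "network", "process"]
    period: 10s

output.elasticsearch:
  hosts: ["elasticsearch.example.com:9200"]   # placeholder host
```

The same config runs unchanged on each Linux node, which matches the "runs on each OS itself" requirement from the earlier reply.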
