Support Questions

What is the best monitoring tool to install on the OS of Hadoop cluster machines (big data servers)?


We manage many Hadoop clusters running Red Hat Enterprise Linux 7.x.

Based on our experience (many problems with low memory, disk performance, networking, etc.), we agree that we need to install a monitoring tool that can retain at least one month of detailed history.

From the link below, https://neverendingsecurity.wordpress.com/tag/atop/, we saw a lot of monitoring tools, but we are not sure which is best for Hadoop clusters.

Meanwhile we have installed the atop tool, which works fine (but takes a lot of space under /var/log/atop).
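As an aside, atop's disk usage can usually be capped through its service config. A sketch of the relevant settings, assuming the EPEL atop package layout on RHEL/CentOS 7 (verify the file and variable names on your own hosts):

```shell
# /etc/sysconfig/atop -- config read by the atop service on RHEL/CentOS
# (variable names as shipped with the EPEL atop package; check your copy)

LOGINTERVAL=600      # sampling interval in seconds (larger = smaller logs)
LOGGENERATIONS=31    # days of daily log files to keep under /var/log/atop
```

With roughly one month of generations kept and a coarser sampling interval, the history requirement above can be met without /var/log/atop growing unbounded.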

**But we are still wondering whether this is a good choice.**

Michael-Bronson
1 ACCEPTED SOLUTION

Super Collaborator

Nagios / OpsView / Sensu are popular options I've seen.

StatsD / collectd / Metricbeat are daemon metric collectors that run on each server (though Metricbeat is somewhat tied to an Elasticsearch cluster).

Prometheus is a popular option nowadays; it scrapes metrics exposed by each local service.

I have played around a bit with netdata, though I'm not sure whether it can be applied to Hadoop monitoring use cases.

Datadog is a vendor that offers lots of integrations, such as Hadoop, YARN, Kafka, ZooKeeper, etc.

... Realistically, you need some JMX + system monitoring tool, and plenty exist.
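On the JMX side: Hadoop daemons expose their JMX metrics as JSON over HTTP via a /jmx servlet on the daemon's web port, which is what most of the tools above scrape. A minimal sketch of pulling one attribute yourself (the host, port, and bean names below are examples, not values from this thread; check your own daemon's web UI port, e.g. the NameNode's):

```python
import json
from urllib.request import urlopen


def parse_jmx_attribute(payload, attribute):
    """Pull one attribute out of a /jmx JSON payload (a {"beans": [...]} dict)."""
    for bean in payload.get("beans", []):
        if attribute in bean:
            return bean[attribute]
    return None  # attribute not found in any returned bean


def fetch_jmx_attribute(base_url, bean_query, attribute):
    """Fetch one attribute from a Hadoop daemon's /jmx endpoint.

    base_url is e.g. "http://namenode:50070" (Hadoop 2.x NameNode web port);
    bean_query narrows the response to the MBeans you care about.
    """
    with urlopen(f"{base_url}/jmx?qry={bean_query}") as resp:
        return parse_jmx_attribute(json.load(resp), attribute)


# Example usage (requires a reachable NameNode; names are illustrative):
# remaining = fetch_jmx_attribute(
#     "http://namenode:50070",
#     "Hadoop:service=NameNode,name=FSNamesystemState",
#     "CapacityRemaining",
# )
```

A cron job or collector agent polling this endpoint and shipping the values to any time-series store covers the "one month of history" requirement.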


4 REPLIES


check_mk is what most people use.
It is easy to configure and provides a nice UI with history saved.
The check_mk agents consume very little CPU and RAM, avoiding any negative impact on other applications running on the host.


Actually, we are thinking of a tool that is installed on each Linux machine, like atop. check_mk monitors the OS remotely from Windows machines, while what we want is a tool that collects the info from the OS itself and runs on each machine itself.

Michael-Bronson

Super Guru
@Michael Bronson

I don't normally like to suggest non-ASF options here on HCC, but have you checked out Elastic Beats? I am using Metricbeat to get Unix cluster monitoring on our Ambari nodes, as well as Windows workstation metrics such as:

  1. CPU Used
  2. Memory Used
  3. Disk Used
  4. Load Average
  5. Inbound/Outbound Traffic
  6. Host Processes
  7. and more...

There is also Winlogbeat, which allows us to tap into Windows event logs and performance monitoring.
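For reference, collecting the metrics listed above takes only a few lines of Metricbeat's system module config. A minimal sketch (the Elasticsearch host is a placeholder; compare against the metricbeat.yml shipped with the package):

```yaml
# metricbeat.yml -- minimal system-metrics sketch
metricbeat.modules:
  - module: system
    metricsets: ["cpu", "memory", "filesystem", "load", "network", "process"]
    period: 10s

output.elasticsearch:
  hosts: ["elasticsearch.example.com:9200"]   # placeholder host
```

The same config runs unchanged on each Linux node, which matches the "runs on each OS itself" requirement from the earlier reply.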
