Created 01-19-2018 02:31 PM
We use an Ambari cluster with Hadoop version 2.6 on our Red Hat Linux 7.3 machines.
After a few months of experience with the Ambari cluster and Hadoop 2.6, we have found some tools/scripts that help maintain Ambari and Hadoop, but I am sure we have not covered all of them.
If possible, we would like a list of tools/scripts/third-party software that can help us with the following tasks:
1. Ambari server / Hadoop maintenance
2. Verification/sanity utilities or scripts for both Ambari and Hadoop 2.6
3. Utilities/scripts that can help trace a problem
4. etc.
I would be happy to get information about these.
Created 01-20-2018 04:04 AM
Ambari is only useful if the agents are healthy and responding. You will at least need something like Nagios to check when services are down, disks are dead or full, fans have stopped working, RAM is bad, etc.
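As a rough illustration of the kind of check such a tool runs (not a replacement for a real monitoring stack), here is a minimal Nagios-style host check in Python. The thresholds, mount points, and process names are assumptions you would adapt to your own layout:

```python
#!/usr/bin/env python3
# Minimal sketch of a Nagios-style host check: disk usage and daemon liveness.
# Thresholds, mount points, and process names below are assumptions for illustration.
import shutil
import subprocess
import sys

DISK_WARN_PCT = 90          # warn when a filesystem is more than 90% full
MOUNTS = ["/", "/hadoop"]   # mount points to watch (adjust to your layout)
PROCESSES = ["NameNode", "DataNode"]  # JVM names as shown by `jps`

def check_disks():
    problems = []
    for mount in MOUNTS:
        usage = shutil.disk_usage(mount)
        pct = usage.used * 100.0 / usage.total
        if pct > DISK_WARN_PCT:
            problems.append("%s is %.0f%% full" % (mount, pct))
    return problems

def check_processes():
    # `jps` lists running JVMs; a missing entry usually means a dead daemon.
    running = subprocess.check_output(["jps"]).decode()
    return ["%s not running" % p for p in PROCESSES if p not in running]

if __name__ == "__main__":
    issues = check_disks() + check_processes()
    if issues:
        print("CRITICAL: " + "; ".join(issues))
        sys.exit(2)          # Nagios convention: exit code 2 = critical
    print("OK")
    sys.exit(0)
```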
Personally, I'm a big fan of Ansible for running distributed SSH commands across the entire cluster. Ansible uses Jinja2 templates, just like Ambari, for templating out config files, and it can start/stop services, sync files across machines, etc. It is much better than ssh-ing to each host one by one, and with the recent release of Ansible Tower you can make a centralized location for all your Ansible scripts. Alternatives such as Puppet and Chef exist, and many shops already have one of them in place elsewhere in their infrastructure. If you have RHEL, then Satellite might be worth using.
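To show what the templating point means in practice, here is a small sketch using the jinja2 Python library to render a config snippet the same way Ansible (and Ambari) fill in templates; the template text and variable values are invented for the example:

```python
# Sketch: rendering a config file from a Jinja2 template, the same mechanism
# Ansible (and Ambari) use to template out configuration files.
# The host name and values here are invented for illustration.
from jinja2 import Template

core_site_template = Template("""
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://{{ namenode_host }}:{{ namenode_port }}</value>
  </property>
</configuration>
""")

rendered = core_site_template.render(namenode_host="nn01.example.com",
                                     namenode_port=8020)
print(rendered)
```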
For tracing problems, you absolutely need a log collection framework and JMX enabled on every Java / Hadoop process. You can pay for Splunk, or you can roll your own setup using Solr or Elasticsearch. Ambari recently added Ambari Infra and Log Search, which are backed by Solr. Lucidworks has a project named Banana that adds a nice dashboarding UI on top of Solr, although Grafana is also nice for dashboarding. If you go with Elasticsearch, it offers the Logstash and Beats products, which integrate well with many other external systems.
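As a small example of what JMX buys you, Hadoop daemons expose their JMX metrics over HTTP via a /jmx servlet. A minimal sketch for pulling a few NameNode counters follows; the host name is an assumption, and port 50070 is the Hadoop 2.x NameNode web UI default:

```python
# Sketch: pulling metrics from the JMX servlet that Hadoop daemons expose over HTTP.
# The host name is an assumption; 50070 is the Hadoop 2.x NameNode web UI default port.
import json
import urllib.request

URL = ("http://namenode.example.com:50070/jmx"
       "?qry=Hadoop:service=NameNode,name=FSNamesystemState")

with urllib.request.urlopen(URL) as resp:
    data = json.load(resp)

for bean in data.get("beans", []):
    # A few counters that are useful when tracing capacity or dead-node problems.
    print("Live DataNodes:", bean.get("NumLiveDataNodes"))
    print("Dead DataNodes:", bean.get("NumDeadDataNodes"))
    print("Capacity used :", bean.get("CapacityUsed"))
```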