What is Apache Ambari?

Apache Ambari is open-source software for installing, managing and monitoring the Apache Hadoop family of components. It automates many routine operations and provides a simple, easy-to-use UI.

How does Ambari work?

Hadoop and its ecosystem of software are typically installed as a multi-node deployment. Ambari has a two-level architecture consisting of an Ambari Server and Ambari Agents. The Ambari Server centrally manages all the agents and sends out the operations to be performed by individual agents. The server installs an agent on each node (host), and each agent in turn installs, configures and manages the services on its host.
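
The two-level design can be pictured with a toy model. This is only a sketch: the class names, the host names and the pull-style polling are assumptions made for illustration and are not taken from Ambari's code.

```python
from collections import defaultdict, deque


class ToyServer:
    """Central coordinator: queues operations per registered host."""

    def __init__(self):
        self.pending = defaultdict(deque)  # host name -> queued commands
        self.hosts = set()

    def register(self, host):
        self.hosts.add(host)

    def schedule(self, host, command):
        if host not in self.hosts:
            raise ValueError(f"unknown host: {host}")
        self.pending[host].append(command)

    def collect(self, host):
        """Hand over (and clear) everything queued for one host."""
        commands = list(self.pending[host])
        self.pending[host].clear()
        return commands


class ToyAgent:
    """Per-host worker: fetches its commands and handles them locally."""

    def __init__(self, host, server):
        self.host = host
        self.server = server
        self.executed = []
        server.register(host)

    def poll(self):
        for command in self.server.collect(self.host):
            self.executed.append(command)  # a real agent would run it here
        return self.executed


server = ToyServer()
agent1 = ToyAgent("node1.example.com", server)
agent2 = ToyAgent("node2.example.com", server)
server.schedule("node1.example.com", "INSTALL DATANODE")
server.schedule("node2.example.com", "INSTALL NODEMANAGER")
server.schedule("node1.example.com", "START DATANODE")
print(agent1.poll())
print(agent2.poll())
```

The point of the sketch is the division of labour: the server only decides what should happen where, and each agent only acts on its own host.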

What are Services?

Services are the various components of the Hadoop ecosystem such as HDFS, YARN, Hive, HBase, Oozie, Druid, etc. One of the most popular open-source Hadoop distributions that bundles these services is the Hortonworks Data Platform (HDP).

How is a stack like HDP installed by Ambari?

  1. Each version of HDP has a corresponding version of Ambari that supports it.
  2. The latest Ambari version can be found on docs.hortonworks.com.
  3. Once the Ambari repository is downloaded and installed, Ambari shows the list of HDP versions it supports.
  4. Ambari then guides the user through an installation wizard, which asks for details such as which services to install and on which nodes.

Ok, Ambari installed HDP. What else can it do?

  1. Ambari can also monitor and manage the services on Hadoop. For example, Ambari can start and stop the services it manages, and a user can add or delete services.
  2. The user can also get metrics and data about the health of the services managed by Ambari.
  3. Ambari also provides Views into some of the components like Hive, HBase, Pig, HDFS, etc., where a user can run queries and various jobs.
  4. Ambari also lets users edit service configurations and versions those configurations, so that an earlier version can be restored later if a changed configuration causes issues.
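
Starting and stopping a service can also be done through the Ambari REST API rather than the UI. The following is a minimal sketch that only builds the request and does not send it. The host name, cluster name and credentials are placeholders, and the helper name `build_service_state_request` is made up for this example. In the REST API a stopped service is in state INSTALLED and a running one in state STARTED.

```python
import base64
import json
import urllib.request


def build_service_state_request(base_url, cluster, service, state, user, password):
    """Build (but do not send) a request asking Ambari to change a service state."""
    if state not in ("STARTED", "INSTALLED"):
        raise ValueError("state must be STARTED or INSTALLED")
    action = "Start" if state == "STARTED" else "Stop"
    payload = {
        "RequestInfo": {"context": f"{action} {service} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/v1/clusters/{cluster}/services/{service}",
        data=json.dumps(payload).encode(),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            # Ambari rejects modifying calls that lack this header.
            "X-Requested-By": "ambari",
        },
    )


# Placeholder values; replace them with your own server, cluster and login.
request = build_service_state_request(
    "http://ambari.example.com:8080", "mycluster", "HDFS", "INSTALLED", "admin", "secret"
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```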

Where do I download the latest repositories for Ambari?

  1. To obtain the Ambari package with HDP cluster definitions, go to https://docs.hortonworks.com/ - select version - Apache Ambari Installation - Obtaining Public Repositories - Ambari Repositories
  2. Get the appropriate repository for your operating system.

How do I upgrade Ambari to a new version?

You can follow the guide for detailed steps on how to upgrade Ambari.

Can Ambari upgrade HDP? How do I decide when to upgrade? Can I upgrade only a specific service?

Yes, Ambari can upgrade HDP. You can upgrade when Hortonworks announces a new release of HDP, or when you need a specific feature that has landed in a new version of HDP. Upgrading a single service as part of a cluster upgrade is not supported; however, on the 2.6.4.x stack you can apply patch or maintenance upgrades to a specific service.

Does Ambari support other stacks like HDF?

Yes. Besides HDP, the Ambari package from Hortonworks supports other stacks such as HCP.

How do I secure my cluster using Ambari?

  1. Kerberos authentication can be enabled from Ambari for network security.
  2. Install Ranger and configure basic authorization in Ranger from Ambari.
  3. Ambari can be configured to use Knox SSO.
  4. You can set up SSL for Ambari.

Does Ambari support HA?

Not as of now. However, you can set up an active-passive ambari-server instance. Refer to the article for more details. Ambari Server HA is planned for a future release of Ambari: AMBARI-17126.

I like Ambari alerts. Can I define my own custom alert for a service?

This article explains how to create custom Ambari alerts.

Where is the Ambari codebase? I heard it's open source

Apache Ambari is completely open source under the Apache license. The code base is available on GitHub.

How can I contribute to Ambari?

This wiki document explains how to contribute to Ambari.

I want to perform scheduled maintenance on some of my cluster nodes, such as adding a disk or replacing a node. How will Ambari react to it?

Ambari has a maintenance mode option for all the services and hosts it manages. Switch on maintenance mode for the host or service affected by the maintenance. This suppresses its alerts, so you can safely perform the maintenance operations.
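
Maintenance mode can also be toggled through the Ambari REST API. The sketch below only assembles the method, path and JSON body for such a call. The helper name `maintenance_request`, the cluster name and the host name are placeholders chosen for this example, and sending the request (with authentication and the `X-Requested-By` header) is left out.

```python
import json


def maintenance_request(cluster, target_type, name, enable):
    """Return (method, path, json_body) for toggling maintenance mode."""
    state = "ON" if enable else "OFF"
    if target_type == "host":
        path = f"/api/v1/clusters/{cluster}/hosts/{name}"
        body = {"Hosts": {"maintenance_state": state}}
    elif target_type == "service":
        path = f"/api/v1/clusters/{cluster}/services/{name}"
        body = {"ServiceInfo": {"maintenance_state": state}}
    else:
        raise ValueError("target_type must be 'host' or 'service'")
    payload = {
        "RequestInfo": {"context": f"Turn {state.capitalize()} Maintenance Mode for {name}"},
        "Body": body,
    }
    return "PUT", path, json.dumps(payload)


# Before replacing a disk on one worker node (placeholder names):
method, path, body = maintenance_request("mycluster", "host", "worker1.example.com", True)
print(method, path)
print(body)
```

Calling the same helper with `enable=False` after the work is finished produces the request that switches alerts back on.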

How does Ambari decide the order in which various components should be installed on respective nodes?

Ambari contains a finite state machine and a command orchestrator, which together track the state of each component and manage the dependencies between components.
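
Dependency-driven ordering of this kind can be illustrated with a topological sort. This is a sketch of the general idea only, not Ambari's orchestrator: the dependency map below is invented for the example and does not reproduce Ambari's real ordering rules.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical map: each key may start only after every component in its set.
START_AFTER = {
    "NAMENODE": {"ZOOKEEPER_SERVER"},
    "DATANODE": {"NAMENODE"},
    "RESOURCEMANAGER": {"NAMENODE"},
    "HBASE_MASTER": {"ZOOKEEPER_SERVER", "NAMENODE"},
    "HIVE_SERVER": {"RESOURCEMANAGER", "DATANODE"},
}


def start_order(dependencies):
    """Return one valid order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


for position, component in enumerate(start_order(START_AFTER), start=1):
    print(position, component)
```

Several orders can satisfy the same constraints, so the exact sequence printed may differ between valid solutions. A cycle in the map makes `static_order()` raise `graphlib.CycleError`.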

What is the significance of the “ambari-qa” user?

The 'ambari-qa' user account is created by Ambari on all nodes in the cluster. This user performs a service check against cluster services as part of the install process. You can refer to the list of other users created during cluster installation.

I changed a config in a service and Ambari recommended changes in other services. Where do these recommendations come from?

These recommendations are provided by a component called StackAdvisor. It is responsible for recommending various configurations at installation time and also maintaining the dependencies for the various services managed by Ambari.

How do I customize the configurations in Ambari Server?

  1. ambari.properties is located at /etc/ambari-server/conf/ambari.properties
  2. Properties with “jdbc” in the key configure the Ambari database.
  3. Another set of properties relates to the JDK and configures the Java version used by Ambari.
  4. Properties starting with “views” configure the behaviour of Ambari Views.
  5. Security-related properties contain keywords such as “kerberos”, “security”, “jce”, etc.
  6. You can run ambari-server as a non-root user by specifying the username in “ambari-server.user”.
  7. You can also specify timeouts for the common Ambari installation tasks, e.g.: agent.package.install.task.timeout, agent.service.check.task.timeout, agent.task.timeout, server.task.timeout
  8. You can also set how long an idle Ambari login session stays active by specifying the time in server.http.session.inactive_timeout
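
To see how these groups look in practice, the file can be read as plain key=value pairs and filtered by keyword. The following is a minimal sketch; the helper names and all values in the sample text are illustrative placeholders, not recommended settings, and the parser ignores advanced properties-file syntax such as line continuations.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def select(props, keyword):
    """Return only the properties whose key contains the keyword."""
    return {k: v for k, v in props.items() if keyword in k}


# Illustrative content only; a real file has many more entries.
SAMPLE = """\
# Illustrative values only
server.jdbc.database=postgres
server.jdbc.user.name=ambari
java.home=/path/to/jdk
ambari-server.user=ambari
agent.task.timeout=900
server.http.session.inactive_timeout=1800
"""

props = parse_properties(SAMPLE)
print(select(props, "jdbc"))
print(select(props, "timeout"))
```

Against a live server the text would come from reading ambari.properties itself; restart ambari-server after changing the file so the new values take effect.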

Can Ambari manage more than one cluster?

As of now, an Ambari instance can manage only one cluster. However, you can remotely view the “views” of another cluster in the same instance. You can read this blog post for more information.

Ambari is cool, what’s next on the roadmap?

You can take a look at all the completed and planned Ambari releases here.

I have a Hadoop cluster. How can I start managing it with Ambari?

  1. If the cluster is not yet in production, clean up the cluster and install it from scratch using Ambari (after backing up the data, of course).
  2. If it is production-critical, then:
    1. Set up ambari-server and the Ambari database.
    2. Install or update the ambari-agents so that they point to the ambari-server.
    3. Use Ambari APIs to perform a cluster takeover, i.e. add the cluster, add hosts, register services and components, and register host components. Refer here for Ambari APIs.
    4. An alternative is to create an Ambari blueprint based on the current configuration and install the cluster on Ambari using the blueprint.
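
The order of the API calls in a takeover can be sketched as a simple plan generator. This is a rough outline under stated assumptions: the helper name `takeover_plan` and the sample layout are made up, request bodies (such as the stack version for the cluster) and all configuration-related calls are omitted, and the plan is only printed, never sent.

```python
def takeover_plan(cluster, layout):
    """List the REST calls in order. layout maps service -> component -> hosts."""
    base = f"/api/v1/clusters/{cluster}"
    hosts = sorted(
        {h for comps in layout.values() for hs in comps.values() for h in hs}
    )
    plan = [("POST", base)]                                      # add cluster
    plan += [("POST", f"{base}/hosts/{h}") for h in hosts]       # add hosts
    plan += [("POST", f"{base}/services/{s}") for s in layout]   # register services
    for service, comps in layout.items():                        # register components
        for comp in comps:
            plan.append(("POST", f"{base}/services/{service}/components/{comp}"))
    for comps in layout.values():                                # register host components
        for comp, comp_hosts in comps.items():
            for h in comp_hosts:
                plan.append(("POST", f"{base}/hosts/{h}/host_components/{comp}"))
    return plan


# Hypothetical layout of an existing cluster.
LAYOUT = {
    "HDFS": {
        "NAMENODE": ["master1.example.com"],
        "DATANODE": ["worker1.example.com", "worker2.example.com"],
    },
}

for method, path in takeover_plan("mycluster", LAYOUT):
    print(method, path)
```

Generating the plan first makes it easy to review the full list of calls before anything touches a production cluster.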

Can I define my own custom service in an existing stack?

Yes. Refer here for details on how to create a custom service in Ambari.

Can I define my own custom Ambari view?

Yes. Examples of views created for Ambari can be found here.

Does Ambari authentication work with SSO?

Yes. You can use Knox SSO to connect to an identity provider (IdP) for Ambari authentication. Follow the instructions for setting up Knox SSO for Ambari.

What is the first place to start troubleshooting an Ambari issue?

  1. Verify that ambari-server is up and running and can communicate with all the ambari-agents.
  2. Perform an Ambari database consistency check to make sure there are no database consistency errors. Run the following command on the ambari-server host: ambari-server check-database
  3. Ambari Server logs are available at /var/log/ambari-server/ambari-server.log
  4. Ambari Agent logs are available at /var/log/ambari-agent/ambari-agent.log
  5. Ambari Agent task logs are available on any host with an Ambari Agent under /var/lib/ambari-agent/data/. This location contains logs for all tasks executed on that host. The files for each task are named as follows:
    1. command-N.json - the command file corresponding to a specific task.
    2. output-N.txt - the output from the command execution.
    3. errors-N.txt - error messages.
  6. You can configure the logging level for ambari-server.log by modifying /etc/ambari-server/conf/log4j.properties on the Ambari Server host. For the Ambari Agents, you can set the loglevel in /etc/ambari-agent/conf/ambari-agent.ini on each host running an Ambari Agent.
  7. You can also consult the troubleshooting guide for specific issues encountered while installing, using or upgrading a cluster with Ambari.
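
When many tasks have run on a host, it helps to narrow the agent task logs down to the ones that produced error output. The sketch below relies only on the file naming described above; the function names are made up for this example, and a non-empty errors-N.txt is just a heuristic, since some tasks write warnings there without failing.

```python
import re
from pathlib import Path

ERRORS_NAME = re.compile(r"^errors-(\d+)\.txt$")


def tasks_with_error_output(data_dir):
    """Return the sorted numbers N whose errors-N.txt contains any text."""
    ids = []
    for path in Path(data_dir).iterdir():
        match = ERRORS_NAME.match(path.name)
        if match and path.is_file() and path.read_text(errors="replace").strip():
            ids.append(int(match.group(1)))
    return sorted(ids)


def task_files(data_dir, task_id):
    """Return the three file paths that belong to one task number."""
    base = Path(data_dir)
    return {
        "command": base / f"command-{task_id}.json",
        "output": base / f"output-{task_id}.txt",
        "errors": base / f"errors-{task_id}.txt",
    }


if __name__ == "__main__":
    data_dir = "/var/lib/ambari-agent/data"
    if Path(data_dir).is_dir():
        for task_id in tasks_with_error_output(data_dir):
            print(task_id, task_files(data_dir, task_id)["errors"])
```

Run it on the affected agent host with a user that can read the data directory, then open the matching output and command files for the task numbers it prints.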

HDP installation via Ambari failed. What options do I have?

  1. Try to re-run the failed steps from the Ambari console.
  2. Restore to a previous snapshot, if available.
  3. If your issue is still not resolved, raise a support case if you’re a Hortonworks customer, or post a question on HCC for further help.

What if the Ambari Server host crashes? What are the recovery options?

  1. Maintaining a backup of the Ambari database after any change to the cluster configuration is always recommended.
  2. If a backup is maintained, you can recover the host and install ambari-server afresh, pointing it to the recovered database.
  3. If there is no backup, an Ambari takeover can be performed by manually adding the hosts, cluster and installed services via the Ambari APIs. Refer here for the list of Ambari APIs and their functions.

What happens when a node in a cluster running a master service component crashes?

One can attempt to recover the host via the ‘Recover Host’ option from the Ambari Web UI.

What happens when a node in a cluster running a slave service component crashes?

  1. Once the node itself has been brought back manually, you can attempt to recover it by performing the ‘Recover Host’ action from the Ambari UI.
  2. If the above action does not restore the cluster to its original state, follow these steps:
    1. Clean up the ambari-agent and all other files on the node.
    2. Perform the ‘Add Host’ operation via the Ambari UI to register the node as a new node.
    3. Select the master/slave components to be installed as part of the ‘Add Host’ wizard.