
Auto-recovery in Ambari restarts cluster components automatically when they fail, without the need for human intervention.

Ambari 2.4.0 introduced dynamic auto-recovery, which allows auto-start properties to be configured without restarting ambari-agent or ambari-server. Currently, the simplest way to manage the auto-recovery features in Ambari is via the REST API (documented in this article), although ongoing work in the community will bring the feature to the UI.

Check Auto-Recovery Settings

To check if auto-recovery is enabled for all components, run the following command on the Ambari server node:

curl -u admin:<password> -i -H 'X-Requested-By: ambari' -X GET http://localhost:8080/api/v1/clusters/<cluster_name>/components?fields=ServiceComponentInfo/componen...

Note that you will need to replace <password> and <cluster_name> with your own values.

The output of the above command will look something like this:

{
  "items" : [
    {
      "href" : "http://localhost:8080/api/v1/clusters/horton/components/APP_TIMELINE_SERVER",
      "ServiceComponentInfo" : {
        "category" : "MASTER",
        "cluster_name" : "horton",
        "component_name" : "APP_TIMELINE_SERVER",
        "recovery_enabled" : "false",
        "service_name" : "YARN"
      }
    },
    {
      "href" : "http://localhost:8080/api/v1/clusters/horton/components/DATANODE",
      "ServiceComponentInfo" : {
        "category" : "SLAVE",
        "cluster_name" : "horton",
        "component_name" : "DATANODE",
        "recovery_enabled" : "false",
        "service_name" : "HDFS"
      }
    },
    ...
  ]
}

Notice the "recovery_enabled" : "false" flag on each component.
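
On a cluster with many components, it can help to list only the ones that still have auto-recovery switched off. The following is a minimal sketch rather than part of the Ambari tooling: the function name list_recovery_disabled is hypothetical, and it assumes the response is pretty-printed with one field per line and "recovery_enabled" reported as a quoted string, as in the sample output. A JSON-aware parser would be more robust where one is available.

```shell
#!/bin/sh
# Hypothetical helper: read an Ambari components response on stdin and
# print the name of every component whose "recovery_enabled" is "false".
# Assumptions: one JSON field per line, and "component_name" appears
# before "recovery_enabled" within each item (as in the sample output).
list_recovery_disabled() {
  awk -F'"' '
    # With a double quote as the separator, field 4 is the value.
    /"component_name"/   { name = $4 }
    /"recovery_enabled"/ { if ($4 == "false") print name }
  '
}
```

Piping the output of the curl command above into list_recovery_disabled should print one component name per line.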

Enable Auto-Recovery for HDP Components

To enable auto-recovery for a single component (in this case HBASE_REGIONSERVER):

curl -u admin:<password> -H "X-Requested-By: ambari" -X PUT 'http://localhost:8080/api/v1/clusters/<cluster_name>/components?ServiceComponentInfo/' -d '{"ServiceComponentInfo" : {"recovery_enabled":"true"}}'
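
The URL in a request like this selects the components to update by name. The sketch below assembles such a URL for one or several component names. It assumes that the selection is written with the Ambari API "in" predicate on ServiceComponentInfo/component_name; that is an assumption, since the filter is not spelled out in the command above. The helper name recovery_url is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the update URL for one or more components.
# Assumption: components are selected with the "in" predicate on
# ServiceComponentInfo/component_name.
# Usage: recovery_url <base_url> <cluster_name> <component> [<component> ...]
recovery_url() {
  base=$1
  cluster=$2
  shift 2
  # Join the remaining arguments (the component names) with commas.
  components=$(IFS=,; printf '%s' "$*")
  printf '%s/api/v1/clusters/%s/components?ServiceComponentInfo/component_name.in(%s)\n' \
    "$base" "$cluster" "$components"
}

# Example (not executed here): pass the generated URL to curl, quoted so
# that the shell does not interpret the parentheses or the question mark.
#   curl -u admin:<password> -H "X-Requested-By: ambari" -X PUT \
#     "$(recovery_url http://localhost:8080 <cluster_name> HBASE_REGIONSERVER)" \
#     -d '{"ServiceComponentInfo" : {"recovery_enabled":"true"}}'
```

Passing several component names as extra arguments produces a single comma-separated list, so one request can cover multiple components.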

To enable auto-recovery for multiple HDP components:


Enable Auto-Recovery for HDF Components

The process is the same for an Ambari-managed HDF cluster. Here is an example of enabling auto-recovery for the HDF services:

curl -u admin:<password> -H "X-Requested-By: ambari" -X PUT 'http://localhost:8080/api/v1/clusters/<cluster_name>/components?ServiceComponentInfo/,ZOOKEEPER_SERVER,KAFKA_BROKER,INFRA_SOLR,LOGSEARCH_LOGFEEDER,LOGSEARCH_SERVER,METRICS_COLLECTOR,METRICS_GRAFANA,METRICS_MONITOR)' -d '{"ServiceComponentInfo" : {"recovery_enabled":"true"}}'

Using an older version of Ambari?

If you're using a version of Ambari older than 2.4.0, check out the following Ambari documentation for details on how to enable auto-recovery via the file:


I got the following response on HDP 2.5 when I tried this solution:

{
  "status": 500,
  "message": "Server Error"
}