My cluster was hung. I was unable to add hosts or perform basic operations in Ambari, such as restarting a service.
The Ambari Server logs kept showing this WARN message:
Unable to lookup the cluster by ID; assuming that there is no cluster and therefore no configs for this execution command: Cluster not found, clusterName=clusterID=-1
Here's a small hack to resolve the issue:
1. Check the cluster id in your backend Ambari DB. Mine is MySQL.
select * from clusterstate;
2. The value found in step 1 should also appear in the "cluster_id" column of the stage table:
select stage_id, request_id, cluster_id from stage;
3. If any rows have a cluster_id of -1, update them to the correct value found in step 1. Example:
UPDATE stage SET cluster_id='2' WHERE request_id IN (383,384,388,389);
4. Restart Ambari-Server
5. After the restart, verify the fix by restarting Grafana or any other small service that does not impact the Hadoop services. If the restart proceeds, the cluster is stable again and you will be able to add nodes.
6. If the issue persists, run the following in your backend Ambari DB:
SELECT * FROM host_role_command WHERE status='PENDING';
7. If you get any output, update the status of those rows to "ABORTED":
UPDATE host_role_command SET status='ABORTED' WHERE status='PENDING';
8. Restart Ambari-Server
Validate the health of Ambari by restarting Grafana or any other small service that does not impact the Hadoop services.
If everything is good, proceed by adding the nodes.
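For anyone who wants to be a bit more careful with the two updates above, here is a rough sketch of how I would narrow things down first. It only uses the tables and columns already mentioned in the steps. The cluster id 2 and the request ids are just the example values from step 3, so substitute your own. The transaction wrapper is my own addition, not part of the original steps, and it assumes your Ambari tables are transactional (for example InnoDB on MySQL).

```sql
-- Phase 1 (steps 2-3): find the requests whose stages carry the placeholder id.
SELECT DISTINCT request_id
FROM stage
WHERE cluster_id = -1;

START TRANSACTION;
-- 2 and the request ids are the example values from step 3; substitute your own.
UPDATE stage SET cluster_id = 2 WHERE request_id IN (383, 384, 388, 389);
-- Inspect the result; run ROLLBACK instead of COMMIT if it looks wrong.
SELECT stage_id, request_id, cluster_id
FROM stage
WHERE request_id IN (383, 384, 388, 389);
COMMIT;

-- Phase 2 (steps 6-7): see how many commands sit in each status before aborting.
SELECT status, COUNT(*) AS commands
FROM host_role_command
GROUP BY status;
```

Filtering on cluster_id = -1 gives you the request ids to fix without scrolling through the whole stage table, and the per-status count shows how many rows the step 7 update will touch before you run it.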