
Cloudbreak Kafka and NiFi cluster nodes going unhealthy after stop and restart

Explorer

We have a highly scalable Kafka cluster with 11 nodes and another highly scalable NiFi cluster with 11 nodes that we are currently testing. Both were created from our custom blueprint and are deployed in Azure.

We find that every time we stop and restart the cluster, a couple of the nodes go unhealthy and we are unable to recover them.

Do let us know how to resolve the issue.

[Attachment: 107746-unhelthy-nodes-issue.png]

1 ACCEPTED SOLUTION

Explorer

Just manually restart the stopped services in Ambari and then run a cluster sync from Cloudbreak. That should resolve the issue.
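
For reference, the "restart the stopped services" step can also be scripted against the Ambari REST API instead of clicking through the UI. The sketch below is a minimal example, assuming placeholder values for the Ambari host, cluster name, and credentials; the _CLIENT skip is only a heuristic, since client components have no running state.

    import requests

    AMBARI = "http://<ambari-host>:8080/api/v1"   # placeholder Ambari server
    CLUSTER = "<cluster-name>"                    # placeholder cluster name
    AUTH = ("admin", "admin")                     # placeholder credentials
    HEADERS = {"X-Requested-By": "ambari"}        # required by Ambari on write calls

    # Find host components that are installed but not running.
    resp = requests.get(
        f"{AMBARI}/clusters/{CLUSTER}/host_components?HostRoles/state=INSTALLED",
        auth=AUTH, headers=HEADERS)
    resp.raise_for_status()

    for item in resp.json().get("items", []):
        host = item["HostRoles"]["host_name"]
        comp = item["HostRoles"]["component_name"]
        if comp.endswith("_CLIENT"):
            continue  # client components stay INSTALLED; nothing to start
        # Ask Ambari to bring the stopped component back to STARTED.
        requests.put(
            f"{AMBARI}/clusters/{CLUSTER}/hosts/{host}/host_components/{comp}",
            auth=AUTH, headers=HEADERS,
            json={"RequestInfo": {"context": f"Start {comp} via API"},
                  "Body": {"HostRoles": {"state": "STARTED"}}})
        print(f"Requested start of {comp} on {host}")

Once everything is green in Ambari, trigger the sync from Cloudbreak so its status catches up.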


BUG-99581: The Event History in the Cloudbreak web UI displays the following message:

Manual recovery is needed for the following failed nodes: []

This message is displayed when the Ambari agent doesn't send a heartbeat and Cloudbreak therefore thinks the host is unhealthy. However, if all services are green and healthy in the Ambari web UI, then the status displayed by Cloudbreak is likely incorrect.

In that case, syncing the cluster from Cloudbreak should fix the problem.
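
To confirm you are in this situation before syncing, you can check how Ambari itself sees each host, i.e. whether the agents are heartbeating. A minimal sketch, assuming the same placeholder server, cluster name, and credentials as above:

    import requests

    AMBARI = "http://<ambari-host>:8080/api/v1"   # placeholder Ambari server
    CLUSTER = "<cluster-name>"                    # placeholder cluster name
    AUTH = ("admin", "admin")                     # placeholder credentials

    # Ask Ambari for each host's state and the time of the last agent heartbeat.
    resp = requests.get(
        f"{AMBARI}/clusters/{CLUSTER}/hosts"
        "?fields=Hosts/host_state,Hosts/last_heartbeat_time",
        auth=AUTH)
    resp.raise_for_status()

    for item in resp.json().get("items", []):
        info = item["Hosts"]
        # HEALTHY means the agent is heartbeating normally; HEARTBEAT_LOST or
        # UNHEALTHY points at a real agent problem that a sync will not fix.
        print(info["host_name"], info["host_state"], info["last_heartbeat_time"])

If every host reports HEALTHY here while Cloudbreak still flags failed nodes, that matches the BUG-99581 symptom and a cluster sync should clear it.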
