Our CDP AWS Datahub cluster was created but failed shortly afterwards because the master node became unreachable. The errors reported include:
1) Cluster creation timeout
2) Master node unreachable
3) Abrupt failure due to unhealthy nodes
Based on the cluster logs, we have ruled out our custom service "MY_SERVICE" as the cause, and we suspect network or connectivity issues instead.
We have also observed:
The Cloudera Manager instance in our CDP AWS Datahub environment is currently marked as unhealthy and is not responding. AWS diagnostics show that the instance status check has failed at the OS level, which points to a problem with the instance's own operating system or configuration.
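
For reference, the status check result above can be reproduced outside the AWS console. The following is only a minimal sketch, assuming boto3 is installed and AWS credentials with EC2 read access are configured. The helper names `summarize_status` and `fetch_status` are our own, and the instance ID and region in the usage comment are placeholders, not values from our environment.

```python
def summarize_status(response):
    """Map each instance ID to (system_status, instance_status).

    `response` is the dict returned by EC2 describe_instance_status.
    """
    summary = {}
    for item in response.get("InstanceStatuses", []):
        summary[item["InstanceId"]] = (
            item.get("SystemStatus", {}).get("Status", "unknown"),
            item.get("InstanceStatus", {}).get("Status", "unknown"),
        )
    return summary


def fetch_status(instance_id, region):
    """Call EC2 for the status checks of one instance (hypothetical helper)."""
    # Imported here so summarize_status can be used without boto3 or AWS access.
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    # IncludeAllInstances=True also returns instances that are not running.
    return ec2.describe_instance_status(
        InstanceIds=[instance_id],
        IncludeAllInstances=True,
    )


# Example call (placeholder instance ID and region):
# print(summarize_status(fetch_status("i-0123456789abcdef0", "us-east-1")))
```

A system status of `ok` combined with an instance status of `impaired` would be consistent with what we see, namely a failure inside the instance rather than on the AWS host.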