Adding cluster failed - how to troubleshoot?

by Cloudera Employee lwang ‎04-17-2017 07:21 PM - edited ‎04-25-2017 03:03 PM

Symptoms

Adding a cluster may fail with various error messages:

  • Cluster creation failed.
  • An internal error occurred while creating the cluster.
  • Your quota allows for 0 more running instance(s). You requested at least 3. Your quota allows for 0 more running instance(s). You requested at least 1.

Applies To

Adding a cluster in Cloudera Altus 

Cause

  • Some error messages are straightforward and explain the cause of the issue themselves.
  • The SSH private key provided while adding the cluster may be invalid.

 

Troubleshooting Steps

  • For the quota error, check with your Amazon EC2 administrator whether the account has reached its instance limit. See the reference below for how to check the limit.
  • Make sure the SSH private key you supply when adding the cluster is valid. See the reference below for how to create the key pair.
  • Contact Cloudera Support for further help.
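
The first two checks above can be sketched from a shell. This is a minimal sketch, not an official Altus procedure. It assumes the AWS CLI and OpenSSH are installed and configured, and the function names, region, and key path are placeholders chosen for illustration. Depending on the account, instance limits may instead be reported through AWS Service Quotas.

```shell
# Sketch only: assumes the AWS CLI and OpenSSH are installed and configured.
# Function names and arguments are illustrative, not part of Altus.

# Print the EC2 "max-instances" account attribute for the given region.
show_instance_limit() {
    aws ec2 describe-account-attributes \
        --attribute-names max-instances \
        --region "$1"
}

# Return 0 if the file parses as a private key, non-zero otherwise.
# ssh-keygen -y derives the public key, which fails for a malformed key file.
check_private_key() {
    [ -r "$1" ] || return 1
    ssh-keygen -y -f "$1" </dev/null >/dev/null 2>&1
}
```

For example, `show_instance_limit us-west-2` prints the attribute for that region, and `check_private_key ~/.ssh/my-altus-key.pem || echo "key is not usable"` flags a key file that cannot be read or parsed.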

Comments
by uzubair
on ‎08-24-2017 01:08 PM - last edited on ‎08-29-2017 06:08 AM by Community Manager

When creating a cluster, I get Failed as the status, and the error message tells me to contact Cloudera.

 

Is there a way to troubleshoot using the logs on s3?

 

All machines are spun up and running, and even Cloudera Manager is accessible, but the cluster status is Failed.

by Cloudera Employee Robert Justice
on ‎08-28-2017 12:27 PM

Hi @uzubair,

 

Just touching base to see if you were able to resolve this issue, and if not, were you able to create a support case through the Altus Web UI (Altus Web UI -> Support -> Support Center -> Technical Support;  Component = Altus Data Engineering;  Sub-Component = Clusters)?

 

Thanks,

Robert Justice

by Cloudera Employee Anthony
on ‎08-29-2017 05:54 AM

Hi @uzubair,

 

Thanks for raising this to our attention.

 

You may be able to check the S3 bucket (if configured) for log output that may help with determining why the described symptoms occurred. 

 

Given the nature of this issue, are you able to create a support case through the Altus Web UI (Altus Web UI -> Support -> Support Center -> Technical Support;  Component = Altus Data Engineering;  Sub-Component = Clusters)?

 

Would you be able to provide some additional details within the support case as well, in particular:

 

  1. The output of the following command (if you have the Altus CLI installed):
    $ altus dataeng describe-clusters --cluster-name <cluster_name>
  2. Cluster Creation time
  3. Number of Workers and Compute Workers created, and whether spot instances were used
  4. Was an Instance Bootstrap Script used? If so, can you attach it to the support case as well?
  5. Have any other cluster creations failed in this same manner recently, or has this happened only once?

 

Kind Regards,

 

Anthony

by uzubair
on ‎08-29-2017 06:06 AM
I was able to resolve the issue...was able to run Spark jobs in Altus using Talend Big Data.

The issue had to do with the VPC setup in my Amazon environment. It wasn't set up properly. I picked a different region where the VPC is set up correctly and I was able to launch the cluster.
by Cloudera Employee Robert Justice
on ‎08-29-2017 06:56 AM

@uzubair,

 

Great to hear!  Thanks for getting back and letting us know.

by Cloudera Employee giladwolff
on ‎08-29-2017 10:15 AM

One typo correction to Anthony's reply above: the CLI command to retrieve information about a cluster is 'describe-cluster', not 'describe-clusters'.
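
The diagnostic steps mentioned in this thread can be collected into a small shell sketch. It uses the corrected `describe-cluster` subcommand and lists whatever log objects exist under an S3 location. The function names, bucket, and prefix are placeholders chosen for illustration. The layout of the logs inside the bucket is not described in this thread, so the prefix is left to the caller.

```shell
# Sketch only: assumes the Altus CLI and the AWS CLI are installed and configured.
# Function names, bucket, and prefix are illustrative placeholders.

# Describe a cluster by name, using the corrected subcommand from this thread.
describe_cluster() {
    command -v altus >/dev/null 2>&1 || {
        echo "altus CLI not found" >&2
        return 127
    }
    altus dataeng describe-cluster --cluster-name "$1"
}

# List objects under the S3 location configured for cluster logs.
# $1 is the bucket name, $2 is the key prefix (both supplied by the caller).
list_cluster_logs() {
    aws s3 ls "s3://$1/$2" --recursive
}
```

For example, `describe_cluster my-cluster > cluster.json` saves the output so it can be attached to a support case.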
