Cannot terminate failed Cloudera Director deployment

Expert Contributor

Similar to: http://community.cloudera.com/t5/Cloudera-Director-Cloud-based/Cannot-terminate-cluster/m-p/77341

 

In this example, Cloudera Director failed to deploy all of the master instances in Azure. I've seen this happen occasionally, with subsequent redeploys completing successfully. This time, however, I cannot terminate the cluster. My guess is that Cloudera Director is confused by the fact that the master instances do not exist. About one minute into the termination attempt I also see the following error:

 

 ERROR [p-44256946bec7-DefaultTerminateClusterJob] 3e44d1dd-563d-4c30-ac83-f273e5277e44 DELETE /api/v11/environments/Evil/deployments/Evil-1/clusters/EvilCorp-1 com.cloudera.launchpad.cleanup.WaitForInstancesTermination - c.c.l.pipeline.util.PipelineRunner: Attempt to execute job failed
java.util.concurrent.TimeoutException: Not all instances terminated in 20 MINUTES as expected

Note the "20 MINUTES": the timeout is reported only about one minute into the termination attempt, not twenty.
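For anyone triaging several of these failures, the relevant fields can be pulled out of the log excerpt above with a couple of regular expressions. This is only a minimal sketch: the function name `parse_termination_failure` and the patterns are my own assumptions based on this single log line, not a documented Cloudera Director log format.

```python
import re

# The two log lines quoted in the post, copied verbatim.
LOG_LINE = (
    "ERROR [p-44256946bec7-DefaultTerminateClusterJob] "
    "3e44d1dd-563d-4c30-ac83-f273e5277e44 "
    "DELETE /api/v11/environments/Evil/deployments/Evil-1/clusters/EvilCorp-1 "
    "com.cloudera.launchpad.cleanup.WaitForInstancesTermination - "
    "c.c.l.pipeline.util.PipelineRunner: Attempt to execute job failed"
)
EXCEPTION_LINE = (
    "java.util.concurrent.TimeoutException: "
    "Not all instances terminated in 20 MINUTES as expected"
)

# Assumed patterns, inferred from the single excerpt above.
REQUEST_RE = re.compile(
    r"(?P<method>[A-Z]+) /api/(?P<version>v\d+)"
    r"/environments/(?P<environment>[^/\s]+)"
    r"/deployments/(?P<deployment>[^/\s]+)"
    r"/clusters/(?P<cluster>[^/\s]+)"
)
TIMEOUT_RE = re.compile(r"terminated in (?P<amount>\d+) (?P<unit>[A-Z]+)")


def parse_termination_failure(log_line, exception_line):
    """Extract the API request and the reported timeout from the two lines."""
    request = REQUEST_RE.search(log_line)
    timeout = TIMEOUT_RE.search(exception_line)
    if request is None or timeout is None:
        raise ValueError("unrecognized log format")
    info = request.groupdict()
    info["timeout_amount"] = int(timeout.group("amount"))
    info["timeout_unit"] = timeout.group("unit")
    return info


failure = parse_termination_failure(LOG_LINE, EXCEPTION_LINE)
```

Comparing the extracted timeout with the actual elapsed time of the job makes the mismatch described here (a 20-minute timeout raised after roughly one minute) easy to spot across many attempts.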

 

We need a force delete option that lets Cloudera Director continue deleting every asset it can, skipping those it cannot. I realize this could be dangerous, hence the "force" nomenclature. As it is, I'm ending up with a stack of failed termination attempts for failed deployments.
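To make the request concrete, the behavior I have in mind is a best-effort loop along these lines. This is a sketch of the idea only, not anything Cloudera Director provides today: `force_delete`, `fake_delete`, and the resource names are all hypothetical.

```python
def force_delete(resources, delete_fn):
    """Try to delete every resource; record failures instead of aborting."""
    deleted, skipped = [], []
    for resource in resources:
        try:
            delete_fn(resource)
        except Exception as exc:
            # Skip what cannot be deleted and keep going.
            skipped.append((resource, str(exc)))
        else:
            deleted.append(resource)
    return deleted, skipped


# Hypothetical stand-in for a cloud provider call: the master
# instances were never created, so deleting them fails.
def fake_delete(resource):
    if resource.startswith("master"):
        raise LookupError(f"{resource} does not exist")


deleted, skipped = force_delete(
    ["master-1", "worker-1", "worker-2"], fake_delete
)
```

The point of returning the skipped list is that nothing is hidden: whatever could not be removed is reported back so it can be cleaned up by hand.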

2 REPLIES
Re: Cannot terminate failed Cloudera Director deployment

Expert Contributor

Are there currently orphaned resources (such as instances) in Azure? If so, can you try deleting them manually and then terminating the deployment again via the Cloudera Director UI?

 

Re: Cannot terminate failed Cloudera Director deployment

Expert Contributor

Hi.

 

You're thinking along the same lines I was. That was the first thing I tried after clearing out the resources in Azure, but to no avail: termination still fails. This operation was much more robust on the AWS side whenever deployments failed.