
Cloudera Director 2.8 fails to terminate Azure resources

I am testing a new cluster deployment in Azure using Cloudera Director 2.8 and the bootstrap-remote command.

 

The cluster bootstrap failed due to an issue with one of the bootstrap scripts defined in the cluster config file, so I ran the terminate-remote command, which consistently fails with:

Delete Failure - not all resources in Resource Group ca6c6e6a were deleted: Virtual Machiness:

This is followed by a list of disk, network device, and virtual machine resources. The Director server logs don't provide many clues:

[2018-09-06 01:27:23.190 +0000] ERROR [p-873a85309f25-DefaultTerminateClusterJob] f769497c-af9a-4a9f-ae2c-6e9bb66a7dff DELETE /api/v12/environments/joeshacks/deployments/joeshacks/clusters/JoeShacks com.cloudera.launchpad.cleanup.TerminateInstances - c.c.l.p.DatabasePipelineRunner: (Suppressed)
com.cloudera.launchpad.pluggable.common.ExceptionConditions$DetailHolderException: Exception details:

[2018-09-06 01:27:23.193 +0000] ERROR [p-873a85309f25-DefaultTerminateClusterJob] f769497c-af9a-4a9f-ae2c-6e9bb66a7dff DELETE /api/v12/environments/joeshacks/deployments/joeshacks/clusters/JoeShacks com.cloudera.launchpad.cleanup.TerminateInstances - c.c.l.p.DatabasePipelineRunner: Pipeline '65c4185d-5e82-441d-9e64-873a85309f25' failed
	at com.cloudera.launchpad.cleanup.TerminateInstances$$EnhancerBySpringCGLIB$$a5e446ce
	at com.cloudera.launchpad.cleanup.TerminateInstancesAndWait:1

I've increased the polling timeouts in the Azure plugin, along with the instance deletion timeout in application.properties, and have tried terminating the cluster several times, to no avail.

Finally, on the seventh try, all Azure resources were deleted.
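
For anyone who hits the same thing, the manual retries above amount to roughly the loop sketched below. This is a generic Python wrapper, not something that ships with Director. The function names, the attempt count, the delay, and the config file name in the usage comment are all hypothetical, and it assumes terminate-remote exits with a non-zero status when the delete fails, which I have not confirmed.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(cmd).returncode


def retry_until_success(
    cmd: Sequence[str],
    max_attempts: int = 10,
    delay_seconds: float = 60.0,
    runner: Callable[[Sequence[str]], int] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Re-run cmd until it exits 0 and return the attempt number that succeeded.

    Raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if runner(cmd) == 0:
            return attempt
        # Wait before the next attempt, but not after the last one.
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise RuntimeError(f"command still failing after {max_attempts} attempts")


# Hypothetical usage (assumes a non-zero exit code on failure; add whatever
# connection options you normally pass to terminate-remote):
#
#   attempts = retry_until_success(
#       ["cloudera-director", "terminate-remote", "cluster.conf"]
#   )
#   print(f"terminated after {attempts} attempt(s)")
```

This only automates the workaround. It does not explain why the earlier attempts leave resources behind.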

 

I routinely test in AWS and have never had problems terminating or cleaning up failed deployments using Director.

Any ideas why several termination attempts are required? 
