Cloudbreak not terminating nodes after auto-scale down of cluster

New Contributor

Hi,

We have a Cloudbreak deployment managing our HDP clusters, with auto-scaling enabled based on Ambari metrics. After a cluster is successfully scaled down, Cloudbreak fails to terminate/delete the VMs on the Azure side and marks the removed hosts as 'Unhealthy' nodes in Cloudbreak. Is there a workaround for this?

Any help would be appreciated!

Thanks,

Cibi

Re: Cloudbreak not terminating nodes after auto-scale down of cluster

@Cibi Chakaravarthi

What is your Cloudbreak version ("cbd version")? Could you attach some logs (cbreak.log file or the output of "cbd logs")?

Re: Cloudbreak not terminating nodes after auto-scale down of cluster

New Contributor

The Cloudbreak version is 1.16.4. According to the logs, the node's status is set to decommissioned after the scale-down event, but the node is not terminated, so termination has to be done manually. I couldn't find any event in the logs that shows a failure to terminate the decommissioned nodes.

Re: Cloudbreak not terminating nodes after auto-scale down of cluster

@Cibi Chakaravarthi There should be some useful information in the logs (there is no sensitive data in them), so please attach them to the case so that we can investigate.
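When a log is too long to read end to end, a small filter can surface the entries worth attaching. The following is a minimal Python sketch, not part of Cloudbreak: it assumes each log line carries a level token such as ERROR or WARN surrounded by whitespace, and that root causes appear on lines containing "Caused by:". The function name `find_failures` and the shortened sample lines are illustrative only.

```python
import re

# Assumption: the level token (ERROR/WARN) is surrounded by whitespace.
LEVEL_RE = re.compile(r"\s(ERROR|WARN)\s")


def find_failures(lines):
    """Return ERROR/WARN entries and 'Caused by:' lines, in order."""
    hits = []
    for line in lines:
        text = line.rstrip("\n")
        if LEVEL_RE.search(text) or "Caused by:" in text:
            hits.append(text)
    return hits


# Shortened, illustrative lines in the assumed format.
sample = [
    "2017-12-05 18:01:07,913 [http-nio-8080-exec-2] getImage:40 DEBUG "
    "c.s.c.s.ComponentConfigProvider - Image found!",
    "2017-12-05 18:01:07,913 [reactorDispatcher-21] accept:48 ERROR "
    "c.s.c.c.h.DownscaleStackCollectResourcesHandler - "
    "Failed to handle DownscaleStackCollectResourcesRequest.",
    "Caused by: java.lang.NullPointerException: null",
]

for hit in find_failures(sample):
    print(hit)
```

Any iterable of lines works as input, so an open file object for a saved copy of cbreak.log (or of the "cbd logs" output) can be passed to `find_failures` directly.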

Re: Cloudbreak not terminating nodes after auto-scale down of cluster

New Contributor

@pdarvasi I was able to find the ERROR below in the CBD log. I can't share the log file itself since it contains some sensitive data.

/cbreak_cloudbreak_1 | 2017-12-05 18:01:07,913 [http-nio-8080-exec-2] getImage:40 DEBUG c.s.c.s.ComponentConfigProvider - [owner:d312e73a-f6dc-4e83-9452-bde66b18791f] [type:cloudbreakLog] [id:undefined] [name:cb] Image found! stackId: 1, component: Component{id=2, componentType=IMAGE, name='IMAGE'}
/cbreak_cloudbreak_1 | 2017-12-05 18:01:07,913 [reactorDispatcher-21] accept:48 ERROR c.s.c.c.h.DownscaleStackCollectResourcesHandler - [owner:d312e73a-f6dc-4e83-9452-bde66b18791f] [type:CLOUDBREAKEVENTDATA] [id:1] [name:cbllapdev30] Failed to handle DownscaleStackCollectResourcesRequest.
/cbreak_cloudbreak_1 | com.sequenceiq.cloudbreak.cloud.exception.CloudConnectorException: can't collect instance resources
/cbreak_cloudbreak_1 |   at com.sequenceiq.cloudbreak.cloud.azure.AzureResourceConnector.collectInstanceResourcesToRemove(AzureResourceConnector.java:293)
/cbreak_cloudbreak_1 |   at com.sequenceiq.cloudbreak.cloud.azure.AzureResourceConnector.collectResourcesToRemove(AzureResourceConnector.java:241)
/cbreak_cloudbreak_1 |   at com.sequenceiq.cloudbreak.cloud.azure.AzureResourceConnector.collectResourcesToRemove(AzureResourceConnector.java:50)
/cbreak_cloudbreak_1 |   at com.sequenceiq.cloudbreak.cloud.handler.DownscaleStackCollectResourcesHandler.accept(DownscaleStackCollectResourcesHandler.java:43)
/cbreak_cloudbreak_1 |   at com.sequenceiq.cloudbreak.cloud.handler.DownscaleStackCollectResourcesHandler.accept(DownscaleStackCollectResourcesHandler.java:19)
/cbreak_cloudbreak_1 |   at com.sequenceiq.cloudbreak.cloud.handler.DownscaleStackCollectResourcesHandler$FastClassBySpringCGLIB$2b40b706.invoke(<generated>)
/cbreak_cloudbreak_1 |   at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
/cbreak_cloudbreak_1 |   at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:738)
/cbreak_cloudbreak_1 |   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
/cbreak_cloudbreak_1 |   at org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:52)
/cbreak_cloudbreak_1 |   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
/cbreak_cloudbreak_1 |   at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
/cbreak_cloudbreak_1 |   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
/cbreak_cloudbreak_1 |   at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:673)
/cbreak_cloudbreak_1 |   at com.sequenceiq.cloudbreak.cloud.handler.DownscaleStackCollectResourcesHandler$EnhancerBySpringCGLIB$7e426fe2.accept(<generated>)
/cbreak_cloudbreak_1 |   at reactor.bus.EventBus$3.accept(EventBus.java:317)
/cbreak_cloudbreak_1 |   at reactor.bus.EventBus$3.accept(EventBus.java:310)
/cbreak_cloudbreak_1 |   at reactor.bus.routing.ConsumerFilteringRouter.route(ConsumerFilteringRouter.java:72)
/cbreak_cloudbreak_1 |   at reactor.bus.routing.TraceableDelegatingRouter.route(TraceableDelegatingRouter.java:51)
/cbreak_cloudbreak_1 |   at reactor.bus.EventBus.accept(EventBus.java:591)
/cbreak_cloudbreak_1 |   at reactor.bus.EventBus.accept(EventBus.java:63)
/cbreak_cloudbreak_1 |   at reactor.core.dispatch.AbstractLifecycleDispatcher.route(AbstractLifecycleDispatcher.java:160)
/cbreak_cloudbreak_1 |   at reactor.core.dispatch.MultiThreadDispatcher$MultiThreadTask.run(MultiThreadDispatcher.java:74)
/cbreak_cloudbreak_1 |   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
/cbreak_cloudbreak_1 |   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
/cbreak_cloudbreak_1 |   at java.lang.Thread.run(Thread.java:745)
/cbreak_cloudbreak_1 | Caused by: java.lang.NullPointerException: null
/cbreak_cloudbreak_1 |   at com.sequenceiq.cloudbreak.cloud.azure.AzureResourceConnector.collectInstanceResourcesToRemove(AzureResourceConnector.java:282)
/cbreak_cloudbreak_1 |   ... 25 common frames omitted
/cbreak_cloudbreak_1 | 2017-12-05 18:01:07,915 [reactorDispatcher-21] accept:52 INFO c.s.c.c.h.DownscaleStackCollectResourcesHandler - [owner:d312e73a-f6dc-4e83-9452-bde66b18791f] [type:CLOUDBREAKEVENTDATA] [id:1] [name:cbllapdev30] DownscaleStackCollectResourcesRequest finished
/cbreak_cloudbreak_1 | 2017-12-05 18:01:07,915 [reactorDispatcher-21] accept:140 DEBUG c.s.c.c.f.Flow2Handler - [owner:d312e73a-f6dc-4e83-9452-bde66b18791f] [type:CLOUDBREAKEVENTDATA] [id:1] [name:cbllapdev30] flow control event arrived: key: DOWNSCALESTACKCOLLECTRESOURCESRESULT_ERROR, flowid: fd790366-8571-479d-98df-f8193447784c, payload: CloudPlatformResult{status=FAILED, statusReason='can't collect instance resources', errorDetails=com.sequenceiq.cloudbreak.cloud.exception.CloudConnectorException: can't collect instance resources, request=CloudStackRequest{, cloudStack=CloudStack{groups=[com.sequenceiq.cloudbreak.cloud.model.Group@e8cbf0d, com.sequenceiq.cloudbreak.cloud.model.Group@26928c18, com.sequenceiq.cloudbreak.cloud.model.Group@67e1ab4b], network=com.sequenceiq.cloudbreak.cloud.model.Network@206a2f0a, image=Image{imageName='https://sequenceiqwestus2.blob.core.windows.net/images/hdc-hdp--1706211640.vhd', userdata={CORE=#!/bin/bash
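If sensitive values are the only thing preventing a log from being shared, masking them first is one option. The sketch below is a hypothetical helper, not a Cloudbreak feature. It assumes the values to hide are UUIDs (such as the owner and flow IDs), the `[name:...]` stack name field, and any URLs. What counts as sensitive is a judgment call for each deployment, so the output should still be reviewed by hand before posting.

```python
import re

# Patterns for the values this sketch treats as sensitive (an assumption).
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)
NAME_RE = re.compile(r"\[name:[^\]]*\]")
URL_RE = re.compile(r"https?://[^\s'\"]+")


def redact(line):
    """Mask UUIDs, [name:...] fields and URLs in a single log line."""
    line = UUID_RE.sub("<uuid>", line)
    line = NAME_RE.sub("[name:<redacted>]", line)
    line = URL_RE.sub("<url>", line)
    return line


example = (
    "[owner:d312e73a-f6dc-4e83-9452-bde66b18791f] [name:cbllapdev30] "
    "Failed to handle DownscaleStackCollectResourcesRequest."
)
print(redact(example))
```

The stack trace frames and exception messages pass through unchanged, which is usually the part needed for diagnosis.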

Re: Cloudbreak not terminating nodes after auto-scale down of cluster (Accepted Solution)

@Cibi Chakaravarthi

I suppose you are using managed disks. If so, this is a known issue that was fixed in Cloudbreak release 1.16.5.

You can update by following the documentation, or you can launch Cloudbreak 1.16.5 from the Azure Marketplace.

Sorry for the inconvenience & I hope this helps!

Re: Cloudbreak not terminating nodes after auto-scale down of cluster

New Contributor
@pdarvasi

Thanks for the details! However, we are not using managed disks in our VM instances.

Re: Cloudbreak not terminating nodes after auto-scale down of cluster

@Cibi Chakaravarthi I suggest you try the new 1.16.5 version anyway, as this part of the code was refactored in it. The update should not affect your running clusters, and it can be done with a single command: "cbd update".

Hope this helps!

Re: Cloudbreak not terminating nodes after auto-scale down of cluster

New Contributor

@pdarvasi Yes, I'm testing it with a different Cloudbreak deployment. I will let you know how it goes. Thanks for the help!

Re: Cloudbreak not terminating nodes after auto-scale down of cluster

New Contributor

Hi,

I'm trying to upscale the cluster from Cloudbreak and getting the below error:

update failed: New node(s) could not be added to the cluster. Reason com.sequenceiq.cloudbreak.service.CloudbreakServiceException: Ambari could not install services. Invalid Add Hosts Template: org.apache.ambari.server.topology.InvalidTopologyTemplateException: Must specify either host_name or host_count for hostgroup: worker

Error log from Ambari-server:

ERROR [ambari-client-thread-3771] BaseManagementHandler:67 - Bad request received: Invalid Add Hosts Template: org.apache.ambari.server.topology.InvalidTopologyTemplateException: Must specify either host_name or host_count for hostgroup: worker

I could also see the same error details in the Ambari server log after enabling DEBUG logging.

Cloudbreak Version: 1.16.5

Ambari server version: 2.6.0.0

Any insight on this issue?
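The rule stated in the error message can be illustrated with a small check. This is only a sketch of the constraint as worded in the message, not Ambari's implementation, and it does not explain why Cloudbreak produced such a request. The field names `host_name` and `host_count` come from the error text. The `blueprint` and `host_group` keys, the entry layout, and the function name are assumptions made for illustration.

```python
def validate_add_hosts(entries):
    """Return one error string per entry lacking host_name and host_count."""
    errors = []
    for entry in entries:
        # "host_group" is an assumed key name for the target host group.
        group = entry.get("host_group", "<unknown>")
        if "host_name" not in entry and "host_count" not in entry:
            errors.append(
                "Must specify either host_name or host_count "
                f"for hostgroup: {group}"
            )
    return errors


# An entry shaped like the one the error complains about (hypothetical values).
bad = [{"blueprint": "my-blueprint", "host_group": "worker"}]
# The same entry with a host count supplied.
good = [{"blueprint": "my-blueprint", "host_group": "worker", "host_count": 1}]

print(validate_add_hosts(bad))
print(validate_add_hosts(good))
```

Comparing the request that Cloudbreak actually sends to Ambari against this rule may help narrow down whether the host group entry for worker is missing both fields.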
