Explorer
Posts: 6
Registered: ‎09-12-2017
Accepted Solution

Cloudera Director fails to bootstrap Cloudera Manager AWS


 

 Hello!

 

I am new to Cloudera and I am trying to install Cloudera Director, Cloudera Manager, and CDH 5 on AWS. I have Director running and I can log in through the SOCKS5 proxy, but I get some errors. These are my steps:

 

1. I SSH into the instance.

2. I then run 

 

ssh -i mykey.pem -CND 8157 ec2-user@PUBLIC-INSTANCE-IP &

3. I then use SwitchyOmega on Chrome (replacing 172.31 with my VPC subnet prefix, which is a private prefix) and open the following address in my browser: http://PRIVATE-INSTANCE-IP:7189/
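
For reference, the flags in the ssh command from step 2 are standard OpenSSH options: -C enables compression, -N skips running a remote command, and -D 8157 opens a local SOCKS proxy on port 8157. Below is a minimal Python sketch that assembles the same command. The helper name socks_tunnel_command is hypothetical, and the key path, user, and host are the placeholders used in this post.

```python
import shlex


def socks_tunnel_command(key_path, user, host, local_port=8157):
    """Build the ssh argument list for a SOCKS tunnel (hypothetical helper)."""
    return [
        "ssh",
        "-i", key_path,          # private key used to authenticate
        "-C",                    # compress traffic on the connection
        "-N",                    # do not run a remote command, only forward
        "-D", str(local_port),   # dynamic (SOCKS) forwarding on this local port
        f"{user}@{host}",
    ]


# Placeholders taken from the post; replace them with real values.
cmd = socks_tunnel_command("mykey.pem", "ec2-user", "PUBLIC-INSTANCE-IP")
print(shlex.join(cmd))
```

The browser proxy extension would then use localhost:8157 as its SOCKS5 proxy.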

 

I can connect to Cloudera Director and set up my environment. However, I keep seeing the messages below in my Bash session:

 

$ channel 2: open failed: connect failed: Connection refused
channel 3: open failed: connect failed: Connection refused
channel 4: open failed: connect failed: Connection refused
channel 5: open failed: connect failed: Connection refused
channel 6: open failed: connect failed: Connection refused
channel 7: open failed: connect failed: Connection refused

Have I misconfigured anything?
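
A note on these messages: with ssh -D, "open failed: connect failed: Connection refused" typically means that a request sent through the tunnel was forwarded, but the destination host and port refused the connection. It does not by itself show that the tunnel is broken. One way to narrow it down is to check, from the Director instance, which ports actually accept connections. The sketch below uses only the Python standard library; the function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

For example, calling port_is_open("PRIVATE-INSTANCE-IP", 7189) with the real private address would report whether the Director web port from the post is reachable.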

 

Cloudera Manager:

 

As soon as I try to add Cloudera Manager, I get a bootstrap failure:

 

I am using AMI:

64-bit
RHEL-7.3_HVM-20170424-x86_64-1-Hourly2-GP2 - ami-5f39a149
 
I use this AMI for both Director and Cloudera Manager. For Director, I configured an AWS IAM role with the policy provided by Cloudera.

After configuring Cloudera Manager and trying to launch the instance, Director reports that the bootstrap failed.

When I try to collect diagnostic data, I get:

Status: 412, Reason: Deployment is not available for /RicohTestCluster/Manager

 

Below are the logs:

Cloudera log

https://we.tl/x4J3H5bJEm

 

Suppressed: com.cloudera.launchpad.pluggable.common.ExceptionConditions$DetailHolderException: Exception details:

Caused by: com.cloudera.director.spi.v1.model.exception.UnrecoverableProviderException: Unexpected problem during instance allocation
	at com.cloudera.director.aws.ec2.EC2Provider.allocateOnDemandInstances(EC2Provider.java:1237)
	at com.cloudera.director.aws.ec2.EC2Provider.allocate(EC2Provider.java:661)
	at com.cloudera.director.aws.ec2.EC2Provider.allocate(EC2Provider.java:1)
	at com.cloudera.launchpad.pluggable.compute.PluggableComputeProvider.allocate(PluggableComputeProvider.java:572)
	... 36 common frames omitted
Caused by: com.cloudera.director.spi.v1.model.exception.UnrecoverableProviderException: Allocated 0 instances when the minimum count is 1.
	at com.cloudera.director.aws.ec2.EC2Provider.allocateOnDemandInstances(EC2Provider.java:1196)
	... 39 common frames omitted
[2017-09-15 17:09:54.182 -0400] ERROR [p-53adcbaa5e08-DefaultBootstrapDeploymentJob] 3b0e9189-6e4b-46be-a711-fcddd719e486 POST /api/v9/environments/RicohTestCluster/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.l.pipeline.util.PipelineRunner: Attempt to execute job failed
com.cloudera.launchpad.pipeline.UnrecoverablePipelineError: Insufficient number of instances available after allocation
	at com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun.run(AllocateInstances.java:239)
	at com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun.run(AllocateInstances.java:191)
	at com.cloudera.launchpad.pipeline.job.Job2.runUnchecked(Job2.java:31)
	at com.cloudera.launchpad.pipeline.job.Job2$$FastClassBySpringCGLIB$$54178502.invoke(<generated>)
	at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:721)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
	at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
	at com.cloudera.launchpad.pipeline.PipelineJobProfiler.profileJobRun(PipelineJobProfiler.java:60)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
	at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
	at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:656)
	at com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun$$EnhancerBySpringCGLIB$$fcfc6faf.runUnchecked(<generated>)
	at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:197)
	at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:168)
	at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
	at com.github.rholder.retry.Retryer.call(Retryer.java:160)
	at com.cloudera.launchpad.pipeline.util.PipelineRunner.attemptMultipleJobExecutionsWithRetries(PipelineRunner.java:133)
	at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.run(DatabasePipelineRunner.java:164)
	at com.cloudera.launchpad.ExceptionHandlingRunnable.run(ExceptionHandlingRunnable.java:57)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-15 17:09:54.185 -0400] ERROR [p-53adcbaa5e08-DefaultBootstrapDeploymentJob] 3b0e9189-6e4b-46be-a711-fcddd719e486 POST /api/v9/environments/RicohTestCluster/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.l.p.DatabasePipelineRunner: Encountered an unrecoverable error ErrorInfo{code=INSTANCE_ALLOCATION_FAILURE, properties={minRequiredCount=1, numberOfInstancesAllocated=0, virtualInstanceGroup=CM}, causes=[]} in job com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun
com.cloudera.launchpad.pipeline.UnrecoverablePipelineError: Insufficient number of instances available after allocation
	at com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun.run(AllocateInstances.java:239)
	at com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun.run(AllocateInstances.java:191)
	at com.cloudera.launchpad.pipeline.job.Job2.runUnchecked(Job2.java:31)
	at com.cloudera.launchpad.pipeline.job.Job2$$FastClassBySpringCGLIB$$54178502.invoke(<generated>)
	at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:721)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
	at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
	at com.cloudera.launchpad.pipeline.PipelineJobProfiler.profileJobRun(PipelineJobProfiler.java:60)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
	at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
	at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:656)
	at com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun$$EnhancerBySpringCGLIB$$fcfc6faf.runUnchecked(<generated>)
	at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:197)
	at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:168)
	at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
	at com.github.rholder.retry.Retryer.call(Retryer.java:160)
	at com.cloudera.launchpad.pipeline.util.PipelineRunner.attemptMultipleJobExecutionsWithRetries(PipelineRunner.java:133)
	at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.run(DatabasePipelineRunner.java:164)
	at com.cloudera.launchpad.ExceptionHandlingRunnable.run(ExceptionHandlingRunnable.java:57)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2017-09-15 17:09:54.186 -0400] ERROR [p-53adcbaa5e08-DefaultBootstrapDeploymentJob] 3b0e9189-6e4b-46be-a711-fcddd719e486 POST /api/v9/environments/RicohTestCluster/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.l.p.DatabasePipelineRunner: Pipeline '4ddcf904-0d85-4019-a28f-53adcbaa5e08' failed
	at com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun$$EnhancerBySpringCGLIB$$fcfc6faf
	at com.cloudera.launchpad.bootstrap.AllocateInstances:0

[2017-09-15 17:09:54.199 -0400] INFO  [p-53adcbaa5e08-DefaultBootstrapDeploymentJob] 3b0e9189-6e4b-46be-a711-fcddd719e486 POST /api/v9/environments/RicohTestCluster/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.l.p.s.PipelineRepositoryService: Pipeline '4ddcf904-0d85-4019-a28f-53adcbaa5e08': RUNNING -> ERROR
[2017-09-15 17:09:54.220 -0400] INFO  [p-53adcbaa5e08-DefaultBootstrapDeploymentJob] 3b0e9189-6e4b-46be-a711-fcddd719e486 POST /api/v9/environments/RicohTestCluster/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.l.d.DeploymentRepositoryService: Deployment 'RicohManager': BOOTSTRAPPING -> BOOTSTRAP_FAILED
[2017-09-15 17:11:13.533 -0400] INFO  [task-thread-4] - - - - - c.c.launchpad.task.RefreshClusters: Refreshing Cluster models
[2017-09-15 17:11:13.534 -0400] INFO  [task-thread-4] - - - - - c.c.launchpad.task.RefreshClusters: Finished refreshing all pre-existing Cluster models
[2017-09-15 17:11:13.985 -0400] INFO  [task-thread-6] - - - - - c.c.l.task.RefreshDeployments: Refreshing pre-existing Deployments
[2017-09-15 17:11:13.985 -0400] INFO  [task-thread-6] - - - - - c.c.l.task.RefreshDeployments: Skipping refresh of deployment RicohTestCluster:RicohManager as it is in transition or terminated. (Stage: BOOTSTRAP_FAILED, deploymentIsNull: true)
[2017-09-15 17:11:13.986 -0400] INFO  [task-thread-6] - - - - - c.c.l.task.RefreshDeployments: Finished refreshing all pre-existing Deployment models
[2017-09-15 17:11:22.445 -0400] DEBUG [qtp1541942595-14] 4c02c291-b86c-4c7d-9a6c-78f6808d892e POST /api/v9/environments/RicohTestCluster/deployments/RicohManager/diagnosticData - - o.s.w.s.m.m.a.ExceptionHandlerExceptionResolver: Resolving exception from handler [public void com.cloudera.launchpad.api.v9.DeploymentsResourceV9.collectDiagnosticData(java.lang.String,java.lang.String) throws java.lang.InterruptedException]: com.cloudera.launchpad.api.common.DeploymentsResource$DeploymentUpdatePreconditionFailedException: Deployment is not available for /RicohTestCluster/RicohManager
[2017-09-15 17:11:22.447 -0400] DEBUG [qtp1541942595-14] 4c02c291-b86c-4c7d-9a6c-78f6808d892e POST /api/v9/environments/RicohTestCluster/deployments/RicohManager/diagnosticData - - o.s.w.s.m.m.a.ExceptionHandlerExceptionResolver: Invoking @ExceptionHandler method: public com.cloudera.launchpad.api.common.DeploymentsResource$DeploymentUpdatePreconditionFailedException com.cloudera.launchpad.api.common.DeploymentsResource$ExceptionAdvice.handleDeploymentUpdatePreconditionFailedException(com.cloudera.launchpad.api.common.DeploymentsResource$DeploymentUpdatePreconditionFailedException)
[2017-09-15 17:11:56.025 -0400] INFO  [qtp1541942595-18] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager - - c.c.l.d.DeploymentRepositoryService: Deployment 'RicohManager': BOOTSTRAP_FAILED -> TERMINATING
[2017-09-15 17:11:56.033 -0400] INFO  [qtp1541942595-18] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager - - c.c.l.p.DatabasePipelineService: Starting pipeline '35571387-e847-462d-b1f5-c85e0bfbbddb' with root job com.cloudera.launchpad.api.jobs.DefaultTerminateDeploymentJob and listener com.cloudera.launchpad.api.listeners.pipeline.TerminateDeploymentListener
[2017-09-15 17:11:56.058 -0400] INFO  [qtp1541942595-18] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager - - c.c.l.p.DatabasePipelineService: Create new runner thread for pipeline '35571387-e847-462d-b1f5-c85e0bfbbddb'
[2017-09-15 17:11:56.081 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager - - c.c.l.d.DeploymentRepositoryService: Deployment 'RicohManager': TERMINATING -> TERMINATING
[2017-09-15 17:11:56.161 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.api.jobs.DefaultTerminateDeploymentJob - c.c.l.pipeline.util.PipelineRunner: >> DefaultTerminateDeploymentJob/6 [Environment{name='RicohTestCluster', provider=InstanceProviderConfig{type='aws'}, credentials=SshCr ...
[2017-09-15 17:11:56.195 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.api.jobs.DefaultTerminateDeploymentJob - c.c.l.pipeline.util.PipelineRunner: << None{}
[2017-09-15 17:11:56.243 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.api.jobs.CancelPipelineAndWaitJob - c.c.l.pipeline.util.PipelineRunner: >> CancelPipelineAndWaitJob/1 [Optional.absent()]
[2017-09-15 17:11:56.243 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.api.jobs.CancelPipelineAndWaitJob - c.c.l.pipeline.util.PipelineRunner: << None{}
[2017-09-15 17:11:56.255 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.pipeline.SetStatusJob - c.c.l.pipeline.util.PipelineRunner: >> SetStatusJob/1 [Terminating Cloudera Manager instance]
[2017-09-15 17:11:56.258 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.pipeline.SetStatusJob - c.c.launchpad.pipeline.AbstractJob: Terminating Cloudera Manager instance
[2017-09-15 17:11:56.258 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.pipeline.SetStatusJob - c.c.l.pipeline.util.PipelineRunner: << None{}
[2017-09-15 17:11:56.314 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.TerminateInstances - c.c.l.pipeline.util.PipelineRunner: >> TerminateInstances/2 [Environment{name='RicohTestCluster', provider=InstanceProviderConfig{type='aws'}, credentials=SshCr ...
[2017-09-15 17:11:56.384 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.TerminateInstances - c.c.director.aws.ec2.EC2Provider: Found EC2 key name ClouderaEDH for fingerprint
[2017-09-15 17:11:56.399 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.TerminateInstances - c.c.director.aws.ec2.EC2Provider: Unable to terminate instances, all unknown [f21dcce3-41ee-4178-8d00-15a5350d77de]
[2017-09-15 17:11:56.399 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.TerminateInstances - c.c.l.pipeline.util.PipelineRunner: << None{}
[2017-09-15 17:11:56.409 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.pipeline.SetStatusJob - c.c.l.pipeline.util.PipelineRunner: >> SetStatusJob/1 [Waiting for instance to be terminated]
[2017-09-15 17:11:56.412 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.pipeline.SetStatusJob - c.c.launchpad.pipeline.AbstractJob: Waiting for instance to be terminated
[2017-09-15 17:11:56.412 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.pipeline.SetStatusJob - c.c.l.pipeline.util.PipelineRunner: << None{}
[2017-09-15 17:11:56.468 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.WaitForInstancesTermination - c.c.l.pipeline.util.PipelineRunner: >> WaitForInstancesTermination/2 [Environment{name='RicohTestCluster', provider=InstanceProviderConfig{type='aws'}, credentials=SshCr ...
[2017-09-15 17:11:56.468 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.WaitForInstancesTermination - c.c.l.p.c.PluggableComputeProvider: Waiting for instances to be terminated.
[2017-09-15 17:11:56.527 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.WaitForInstancesTermination - c.c.director.aws.ec2.EC2Provider: Found EC2 key name ClouderaEDH for fingerprint
[2017-09-15 17:11:56.550 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.WaitForInstancesTermination - c.c.director.aws.ec2.EC2Provider: >> Fetching page 0
[2017-09-15 17:11:56.551 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.WaitForInstancesTermination - c.c.l.p.c.PluggableComputeProvider: Instance: f21dcce3-41ee-4178-8d00-15a5350d77de has desired status: UNKNOWN
[2017-09-15 17:11:56.551 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.WaitForInstancesTermination - c.c.l.p.c.PluggableComputeProvider: Instance: f21dcce3-41ee-4178-8d00-15a5350d77de has status: UNKNOWN
[2017-09-15 17:11:56.551 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager com.cloudera.launchpad.cleanup.WaitForInstancesTermination - c.c.l.pipeline.util.PipelineRunner: << None{}
[2017-09-15 17:11:56.559 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager - - c.c.l.p.s.PipelineRepositoryService: Pipeline '35571387-e847-462d-b1f5-c85e0bfbbddb': RUNNING -> COMPLETED
[2017-09-15 17:11:56.570 -0400] INFO  [p-c85e0bfbbddb-DefaultTerminateDeploymentJob] 74a94fea-3335-4990-9731-061e8c5c1d10 DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager - - c.c.l.d.DeploymentRepositoryService: Deployment 'RicohManager': TERMINATING -> TERMINATED
[2017-09-15 17:11:59.081 -0400] INFO  [qtp1541942595-17] 32dec189-5de7-41ef-9da6-51519ad67c2a DELETE /api/v9/environments/RicohTestCluster/deployments/RicohManager - - c.c.l.api.common.DeploymentsResource: Nothing to do on deployment. Current deployment status is: Status{stage=TERMINATED, description='Done', descriptionDetails=[], errorInfo=Optional.absent(), remainingSteps=0, completedSteps=6, health=Health{status=NOT_AVAILABLE}, diagnosticDataSummaries=[]}
[2017-09-15 17:12:13.392 -0400] INFO  [task-thread-10] - - - - - c.c.l.m.r.DeploymentsReporter: Enqueueing all deployments for usage reporting
[2017-09-15 17:12:13.394 -0400] INFO  [task-thread-10] - - - - - c.c.l.m.r.DeploymentsReporter: Enqueueing 0 deployments for usage reporting
[2017-09-15 17:16:13.535 -0400] INFO  [task-thread-10] - - - - - c.c.launchpad.task.RefreshClusters: Refreshing Cluster models
[2017-09-15 17:16:13.535 -0400] INFO  [task-thread-10] - - - - - c.c.launchpad.task.RefreshClusters: Finished refreshing all pre-existing Cluster models
[2017-09-15 17:16:13.988 -0400] INFO  [task-thread-1] - - - - - c.c.l.task.RefreshDeployments: Refreshing pre-existing Deployments
[2017-09-15 17:16:13.989 -0400] INFO  [task-thread-1] - - - - - c.c.l.task.RefreshDeployments: Skipping refresh of deployment RicohTestCluster:RicohManager as it is in transition or terminated. (Stage: TERMINATED, deploymentIsNull: true)
[2017-09-15 17:16:13.989 -0400] INFO  [task-thread-1] - - - - - c.c.l.task.RefreshDeployments: Finished refreshing all pre-existing Deployment models
[2017-09-15 17:17:13.394 -0400] INFO  [task-thread-8] - - - - - c.c.l.m.r.DeploymentsReporter: Enqueueing all deployments for usage reporting
[2017-09-15 17:17:13.396 -0400] INFO  [task-thread-8] - - - - - c.c.l.m.r.DeploymentsReporter: Enqueueing 0 deployments for usage reporting

 

Please help! I'd like to get my cluster up and running!

Thanks

Cloudera Employee
Posts: 49
Registered: ‎10-28-2014

Re: Cloudera Director fails to bootstrap Cloudera Manager AWS

I'm sorry you're having trouble getting your cluster running. You are trying to create the cluster through the Cloudera Director UI, right? If so (or if you're using the bootstrap-remote command from the command line), you should look in Cloudera Director server's application.log for an earlier exception giving the specific failure during instance allocation.
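
As a rough illustration of that search, the Python sketch below returns the earliest ERROR entry in a Director log together with the line that follows it, which is usually the exception message. It assumes entries start with "[timestamp] LEVEL", as in the excerpts in this thread. The function name first_error_with_cause is hypothetical, and the sample lines are shortened copies of entries from this thread.

```python
import re

# Entries in this thread look like "[2017-10-04 16:55:46.289 -0400] ERROR [thread] ...".
ERROR_ENTRY = re.compile(r"^\[[^\]]+\]\s+ERROR\b")


def first_error_with_cause(lines):
    """Return (error_entry, next_line) for the earliest ERROR entry, or None."""
    lines = [line.rstrip("\n") for line in lines]
    for index, line in enumerate(lines):
        if ERROR_ENTRY.match(line):
            following = lines[index + 1] if index + 1 < len(lines) else ""
            return line, following
    return None


# Shortened sample entries (the "..." marks text cut for brevity).
sample = [
    "[2017-10-04 16:55:46.061 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] ... >> Submitted 1 run instance requests.",
    "[2017-10-04 16:55:46.289 -0400] ERROR [p-532ade9d861a-DefaultBootstrapDeploymentJob] ... Exception while trying to allocate instance.",
    "java.util.concurrent.ExecutionException: ... AmazonEC2Exception: Duplicate tag key 'Name' specified. ...",
]
result = first_error_with_cause(sample)
```

To run it against a real log, pass an open file object, for example open("application.log"), using the actual path of the log on the Director server.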

 

Explorer
Posts: 6
Registered: ‎09-12-2017

Re: Cloudera Director fails to bootstrap Cloudera Manager AWS


Thanks, Jadair, for your help. I found the exception below.

The only thing I see is that the tag key "Name" is duplicated. Could that be it?

 

 

[2017-10-04 16:55:45.707 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances - c.c.l.pipeline.util.PipelineRunner: >> AllocateInstances/2 [VirtualInstanceGroup{name='CM', virtualInstances=[VirtualInstance{id='7e44695f-c37c-49b0-a132-3b6b7 ...
[2017-10-04 16:55:45.812 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances - c.c.l.pipeline.util.PipelineRunner: << DatabaseValue{delegate=PersistentValueEntity{id=526, pipeline=00720bdb-c415-4354-a272-532ade9d861a, ...
[2017-10-04 16:55:45.915 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.l.pipeline.util.PipelineRunner: >> AllocateInstances$AllocateAndWaitForInstancesToRun/2 [VirtualInstanceGroup{name='CM', virtualInstances=[VirtualInstance{id='7e44695f-c37c-49b0-a132-3b6b7 ...
[2017-10-04 16:55:45.915 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.l.bootstrap.AllocateInstances: Allocating 1 instances (min count 1) in group CM
[2017-10-04 16:55:45.977 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.director.aws.ec2.EC2Provider: Found EC2 key name ClouderaEDH for fingerprint
[2017-10-04 16:55:45.977 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.director.aws.ec2.EC2Provider: >> Requesting 1 instances for com.cloudera.director.aws.ec2.EC2InstanceTemplate@2d46916d
[2017-10-04 16:55:46.022 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.director.aws.ec2.EC2Provider: >> Building 1 instance requests
[2017-10-04 16:55:46.025 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.director.aws.ec2.EC2Provider: >> Network interface specification: {DeviceIndex: 0,SubnetId: subnet-6712fd2c,Groups: [sg-b18d4cc2],DeleteOnTermination: true,PrivateIpAddresses: [],AssociatePublicIpAddress: true,Ipv6Addresses: [],}
[2017-10-04 16:55:46.048 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.director.aws.ec2.EC2Provider: >> Original image block device mappings: [{DeviceName: /dev/sda1,Ebs: {SnapshotId: snap-02a781e1733af35eb,VolumeSize: 10,DeleteOnTermination: true,VolumeType: gp2,Encrypted: false},}]
[2017-10-04 16:55:46.054 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.director.aws.ec2.EC2Provider: >> Block device mappings: [{DeviceName: /dev/sda1,Ebs: {SnapshotId: snap-02a781e1733af35eb,VolumeSize: 100,DeleteOnTermination: true,VolumeType: gp2,},}]
[2017-10-04 16:55:46.054 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.director.aws.ec2.EC2Provider: >> Instance request type: m4.xlarge, image: ami-5f39a149
[2017-10-04 16:55:46.061 -0400] INFO [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.director.aws.ec2.EC2Provider: >> Submitted 1 run instance requests.
[2017-10-04 16:55:46.289 -0400] ERROR [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.director.aws.ec2.EC2Provider: Exception while trying to allocate instance.
java.util.concurrent.ExecutionException: com.cloudera.director.aws.shaded.com.amazonaws.services.ec2.model.AmazonEC2Exception: Duplicate tag key 'Name' specified. (Service: AmazonEC2; Status Code: 400; Error Code: InvalidParameterValue; Request ID: 849a90ee-d9e0-4ded-b2e6-95ae95694bd2)


	at java.lang.Thread.run(Thread.java:748)
Suppressed: com.cloudera.launchpad.pluggable.common.ExceptionConditions$DetailHolderException: Exception details:
Caused by: com.cloudera.director.spi.v1.model.exception.UnrecoverableProviderException: Unexpected problem during instance allocation
	at com.cloudera.director.aws.ec2.EC2Provider.allocateOnDemandInstances(EC2Provider.java:1237)
	at com.cloudera.director.aws.ec2.EC2Provider.allocate(EC2Provider.java:661)
	at com.cloudera.director.aws.ec2.EC2Provider.allocate(EC2Provider.java:1)
	at com.cloudera.launchpad.pluggable.compute.PluggableComputeProvider.allocate(PluggableComputeProvider.java:572)
	... 36 common frames omitted
Caused by: com.cloudera.director.spi.v1.model.exception.UnrecoverableProviderException: Allocated 0 instances when the minimum count is 1.
	at com.cloudera.director.aws.ec2.EC2Provider.allocateOnDemandInstances(EC2Provider.java:1196)
	... 39 common frames omitted
[2017-10-04 16:55:46.338 -0400] ERROR [p-532ade9d861a-DefaultBootstrapDeploymentJob] 5f36dff1-458a-4aa3-b77d-81093a318f3f POST /api/v9/environments/QA-DevEnvironment/deployments com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun - c.c.l.pipeline.util.PipelineRunner: Attempt to execute job failed
com.cloudera.launchpad.pipeline.UnrecoverablePipelineError: Insufficient number of instances available after allocation
	at com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun.run(AllocateInstances.java:239)
	at com.cloudera.launchpad.bootstrap.AllocateInstances$AllocateAndWaitForInstancesToRun.run(AllocateInstances.java:191)
	at com.cloudera.launchpad.pipeline.job.Job2.runUnchecked(Job2.java:31)

 

Cloudera Employee
Posts: 49
Registered: ‎10-28-2014

Re: Cloudera Director fails to bootstrap Cloudera Manager AWS

Yes. Are you attempting to specify a custom Name tag in your instance template? Cloudera Director by default uses that tag to attach the name that it generates for the instance. If it is not critical for you to provide a value for the Name tag, you should remove it from your configuration. If it is critical, it is possible to configure Director to use custom tag names. See the "Configuring Cloudera Director to Use Custom Tag Names on AWS" section of the Cloudera Director User Guide. There was a bug around this functionality in Director 2.5.0, but it should work correctly in Director 2.5.1 and later.
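
To make the collision concrete: the EC2 error in the log ("Duplicate tag key 'Name' specified") is what one would expect if the instance request carries a Name tag from Director and a second Name tag from the instance template. The Python sketch below shows a simple pre-check for duplicate keys in a tag list. The function name duplicate_tag_keys and the tag values are hypothetical and are not taken from Director itself.

```python
from collections import Counter


def duplicate_tag_keys(tags):
    """Return the sorted tag keys that appear more than once in (key, value) pairs."""
    counts = Counter(key for key, _value in tags)
    return sorted(key for key, count in counts.items() if count > 1)


# Hypothetical tag list: one Name tag generated by Director, one set by the user.
requested_tags = [
    ("Name", "director-generated-name"),
    ("Name", "my-custom-name"),
    ("owner", "example-user"),
]
duplicates = duplicate_tag_keys(requested_tags)
```

An empty result would mean the custom tags do not clash with each other or with the Name tag that Director sets.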

Explorer
Posts: 6
Registered: ‎09-12-2017

Re: Cloudera Director fails to bootstrap Cloudera Manager AWS

Yes, I was. I deleted the Name tag, but the bootstrap kept failing. The log showed that Cloudera Director was constantly trying to connect to Cloudera Manager, so I checked the security group configuration and allowed all traffic from within the security group.

 

They are both configured in the same VPC and subnet, and under the same security group.
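
For anyone who wants to script the same security group change, below is a minimal Python sketch of a self-referencing ingress rule. It assumes a boto3 EC2 client is passed in, and uses the client's authorize_security_group_ingress call with IpProtocol "-1", which means all protocols. The helper names are hypothetical, and the group ID is the one that appears in the log earlier in this thread.

```python
def intra_group_all_traffic_rule(group_id):
    """Build an IpPermissions list allowing all traffic from the group to itself."""
    return [
        {
            "IpProtocol": "-1",  # "-1" means all protocols and ports
            "UserIdGroupPairs": [{"GroupId": group_id}],  # source is the same group
        }
    ]


def allow_intra_group_traffic(ec2_client, group_id):
    """Add the self-referencing rule using an EC2 client supplied by the caller."""
    ec2_client.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=intra_group_all_traffic_rule(group_id),
    )
```

With boto3 installed and credentials configured, this could be called as allow_intra_group_traffic(boto3.client("ec2"), "sg-b18d4cc2"). A narrower rule limited to the ports that Director and Cloudera Manager need would be a safer choice outside of a test setup.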

 

Thanks for your help!
