Bootstrap fails with "Insufficient number of instances available in time 20 MINUTES"

The bootstrap fails with "Insufficient number of instances available in time 20 MINUTES" even though all the requested instances and their EBS volumes are provisioned. I'm running Director 2.2.

 

[2017-07-26 13:42:12] INFO  [qtp614855935-17] - c.c.l.p.c.PluggableComputeClusterTemplateValidator: Validating virtual instances of cluster Spark-DataScience
[2017-07-26 13:42:12] INFO  [qtp614855935-17] - c.c.l.p.c.PluggableComputeInstanceTemplateValidator: Validating instance template for compute provider: aws
[2017-07-26 13:42:12] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'us-east-1'
[2017-07-26 13:42:12] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.us-east-1.amazonaws.com' for region 'us-east-1'
[2017-07-26 13:42:12] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'us-east-1'
[2017-07-26 13:42:12] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.us-east-1.amazonaws.com' for region 'us-east-1'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: Found EC2 key name cd-poc for fingerprint
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing AMI 'ami-08bf131e'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing subnet 'subnet-533c820a'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing security group 'sg-cdeabeb0'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing key pair
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.l.p.c.PluggableComputeInstanceTemplateValidator: Validating instance template for compute provider: aws
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'us-east-1'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.us-east-1.amazonaws.com' for region 'us-east-1'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'us-east-1'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.us-east-1.amazonaws.com' for region 'us-east-1'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.director.aws.ec2.EC2Provider: Found EC2 key name cd-poc for fingerprint
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing AMI 'ami-08bf131e'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing subnet 'subnet-533c820a'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing security group 'sg-cdeabeb0'
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing key pair
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.l.m.m.p.ClouderaManagerMetadata: No repository specified, using metadata for default Cloudera Manager version
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.l.b.v.GenericClusterTemplateValidator: No product version metadata available for CDH:5. Using current version metadata instead.
[2017-07-26 13:42:13] INFO  [qtp614855935-17] - c.c.l.p.DatabasePipelineService: Starting pipeline 'f2c751c1-48eb-443c-a9c3-e520cb9ce603' with root job com.cloudera.launchpad.api.jobs.DefaultBootstrapClusterJob and listener com.cloudera.launchpad.api.listeners.pipeline.BootstrapClusterListener
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.d.ClusterRepositoryService: Cluster 'Spark-DataScience': BOOTSTRAPPING -> BOOTSTRAPPING
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.pipeline.util.PipelineRunner: >> DefaultBootstrapClusterJob/4 [Environment{name='CapOne - Dev2 - CDH59 Environment', provider=InstanceProviderConfig{type='aws'},  ...
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.pipeline.util.PipelineRunner: << DatabaseValue{delegate=PersistentValueEntity{id=26294, pipeline=f2c751c1-48eb-443c-a9c3-e520cb9ce603 ...
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.pipeline.util.PipelineRunner: >> SetStatusJob/1 [Requesting 7 instance(s) in 2 group(s)]
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.launchpad.pipeline.AbstractJob: Requesting 7 instance(s) in 2 group(s)
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.pipeline.util.PipelineRunner: << None{}
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.pipeline.util.PipelineRunner: >> ParallelForEachInBatches/4 [20, class com.cloudera.launchpad.bootstrap.AllocateInstances, [VirtualInstanceGroup{name='masters', ...
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.p.u.ParallelForEachInBatches: Generating batch for job class com.cloudera.launchpad.bootstrap.AllocateInstances of size 2
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.pipeline.util.PipelineRunner: << DatabaseValue{delegate=PersistentValueEntity{id=26299, pipeline=f2c751c1-48eb-443c-a9c3-e520cb9ce603 ...
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.pipeline.util.PipelineRunner: >> UnboundedParallelForEach/3 [class com.cloudera.launchpad.bootstrap.AllocateInstances, [VirtualInstanceGroup{name='masters', vir ...
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.p.DatabasePipelineService: Starting pipeline 'f2c751c1-48eb-443c-a9c3-e520cb9ce603/child-00000-93e490b7-6e18-4981-a9c5-7ee8105e67cc' with root job com.cloudera.launchpad.bootstrap.AllocateInstances and listener com.cloudera.launchpad.pipeline.listener.NoopPipelineStageListener
[2017-07-26 13:42:14] INFO  [pipeline-thread-4] - c.c.l.p.DatabasePipelineService: Starting pipeline 'f2c751c1-48eb-443c-a9c3-e520cb9ce603/child-00000-42ce02b3-727b-4df5-a15d-3f99305788d5' with root job com.cloudera.launchpad.bootstrap.AllocateInstances and listener com.cloudera.launchpad.pipeline.listener.NoopPipelineStageListener
[2017-07-26 13:42:15] INFO  [pipeline-thread-4] - c.c.l.pipeline.util.PipelineRunner: << DatabaseValue{delegate=PersistentValueEntity{id=26310, pipeline=f2c751c1-48eb-443c-a9c3-e520cb9ce603 ...
[2017-07-26 13:42:15] INFO  [pipeline-thread-4] - c.c.l.pipeline.util.PipelineRunner: >> UnboundedWaitForAllPipelines/1 [[f2c751c1-48eb-443c-a9c3-e520cb9ce603/child-00000-93e490b7-6e18-4981-a9c5-7ee8105e67cc, f2c751c1-48 ...
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.l.pipeline.util.PipelineRunner: >> AllocateInstances/2 [VirtualInstanceGroup{name='masters', virtualInstances=[VirtualInstance{id='dbd5b101-667f-46db-956e- ...
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.l.pipeline.util.PipelineRunner: >> AllocateInstances/2 [VirtualInstanceGroup{name='workers', virtualInstances=[VirtualInstance{id='9a8baa82-4194-4867-9023- ...
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.l.pipeline.util.PipelineRunner: << DatabaseValue{delegate=PersistentValueEntity{id=26319, pipeline=f2c751c1-48eb-443c-a9c3-e520cb9ce603 ...
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.l.pipeline.util.PipelineRunner: << DatabaseValue{delegate=PersistentValueEntity{id=26320, pipeline=f2c751c1-48eb-443c-a9c3-e520cb9ce603 ...
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.l.pipeline.util.PipelineRunner: >> AllocateInstances$AllocateAndWaitForInstancesToRun/2 [VirtualInstanceGroup{name='workers', virtualInstances=[VirtualInstance{id='9a8baa82-4194-4867-9023- ...
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.l.pipeline.util.PipelineRunner: >> AllocateInstances$AllocateAndWaitForInstancesToRun/2 [VirtualInstanceGroup{name='masters', virtualInstances=[VirtualInstance{id='dbd5b101-667f-46db-956e- ...
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.l.bootstrap.AllocateInstances: Allocating 6 instances (min count 1) in group workers
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.l.bootstrap.AllocateInstances: Allocating 1 instances (min count 1) in group masters
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: Found EC2 key name cd-poc for fingerprint
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: >> Requesting 1 instances for com.cloudera.director.aws.ec2.EC2InstanceTemplate@1a5d2935
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: >> Building instance requests
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: >> Network interface specification: {DeviceIndex: 0,SubnetId: subnet-533c820a,Groups: [sg-cdeabeb0],DeleteOnTermination: true,PrivateIpAddresses: [],AssociatePublicIpAddress: false}
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: >> Original image block device mappings: [{DeviceName: /dev/sda1,Ebs: {SnapshotId: snap-0c22e054999b5520f,VolumeSize: 50,DeleteOnTermination: true,VolumeType: gp2,Encrypted: false},}]
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: >> Block device mappings: [{DeviceName: /dev/sda1,Ebs: {SnapshotId: snap-0c22e054999b5520f,VolumeSize: 75,DeleteOnTermination: true,VolumeType: gp2,},}]
[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: >> Instance request type: m4.large, image: ami-08bf131e, group size: 1
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: Found EC2 key name cd-poc for fingerprint
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: >> Requesting 6 instances for com.cloudera.director.aws.ec2.EC2InstanceTemplate@2fc40cf8
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: >> Building instance requests
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: >> Network interface specification: {DeviceIndex: 0,SubnetId: subnet-533c820a,Groups: [sg-cdeabeb0],DeleteOnTermination: true,PrivateIpAddresses: [],AssociatePublicIpAddress: false}
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: >> Original image block device mappings: [{DeviceName: /dev/sda1,Ebs: {SnapshotId: snap-0c22e054999b5520f,VolumeSize: 50,DeleteOnTermination: true,VolumeType: gp2,Encrypted: false},}]
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: EBS volumes will be allocated as part of instance launch request
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: >> Block device mappings: [{DeviceName: /dev/sda1,Ebs: {SnapshotId: snap-0c22e054999b5520f,VolumeSize: 50,DeleteOnTermination: true,VolumeType: gp2,},}, {DeviceName: /dev/sdf,Ebs: {VolumeSize: 1792,DeleteOnTermination: true,VolumeType: st1,Encrypted: false},}]
[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: >> Instance request type: m4.2xlarge, image: ami-08bf131e, group size: 6
[2017-07-26 13:42:16] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: << Reservation r-0519ae93f01066f5f with Instance{id=i-0e91d6d581c37c4b5 privateIp=10.16.113.60}
[2017-07-26 13:42:16] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: >> Tagging instance i-0e91d6d581c37c4b5 / dbd5b101-667f-46db-956e-0bd87431cbfa
[2017-07-26 13:42:16] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: << Reservation r-010b897d2842fb7c3 with Instance{id=i-00fc424904bfbab18 privateIp=10.16.113.157} Instance{id=i-051ee7afc61f6bee5 privateIp=10.16.113.79} Instance{id=i-065cdae4186725920 privateIp=10.16.113.167} Instance{id=i-02273a6e15782f7f3 privateIp=10.16.113.73} Instance{id=i-0db34bbdc9ecf8b7a privateIp=10.16.113.207} Instance{id=i-0fa73ede05ac38a26 privateIp=10.16.113.210}
[2017-07-26 13:42:16] INFO  [pipeline-thread-6] - c.c.director.aws.ec2.EC2Provider: >> Tagging instance i-00fc424904bfbab18 / 81b0bb5b-364d-4832-888e-bc1581b1ef68
[2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: << Instance i-0e91d6d581c37c4b5 got IP 10.16.113.60
[2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.l.bootstrap.AllocateInstances: Waiting for 0 instances to start running
[2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.l.p.c.PluggableComputeProvider: Waiting for 0 instances to be running
[2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.l.pipeline.util.PipelineRunner: << DatabaseValue{delegate=PersistentValueEntity{id=26327, pipeline=f2c751c1-48eb-443c-a9c3-e520cb9ce603 ...
[2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.l.pipeline.util.PipelineRunner: >> AllocateInstances$GetSuccessfulInstancesAndTerminateFailedInstances/4 [Environment{name='CapOne - Dev2 - CDH59 Environment', provider=InstanceProviderConfig{type='aws'},  ...
[2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.l.bootstrap.AllocateInstances: All requested instances failed.
[2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.l.bootstrap.AllocateInstances: Minimum number of instances (1) not available. Terminating available instances (0) as well.
[2017-07-26 13:42:31] ERROR [pipeline-thread-5] - c.c.l.pipeline.util.PipelineRunner: Attempt to execute job failed
com.cloudera.launchpad.pipeline.UnrecoverablePipelineError: Insufficient number of instances available in time 20 MINUTES

<snip>

[2017-07-26 13:42:35] INFO  [pipeline-thread-6] - c.c.l.bootstrap.AllocateInstances: All requested instances are available
[2017-07-26 13:42:35] INFO  [pipeline-thread-6] - c.c.l.bootstrap.AllocateInstances: Sufficient number of instances available (6/6)
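
Because the masters and workers groups are allocated on separate pipeline threads, the interleaved log is easier to read when the AllocateInstances messages are grouped by thread. The following is only a minimal sketch, assuming the log line layout shown above; the function name `allocation_messages_by_thread` and the `SAMPLE` list are illustrative helpers, not part of Director.

```python
import re
from collections import defaultdict

# Matches lines of the form seen in the excerpt above, e.g.
# [2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.l.bootstrap.AllocateInstances: message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>\w+)\s+\[(?P<thread>[^\]]+)\]\s+-\s+"
    r"(?P<logger>[^:\s]+):\s(?P<msg>.*)$"
)


def allocation_messages_by_thread(lines):
    """Group messages logged by the AllocateInstances logger by thread name.

    Returns a dict mapping thread name to a list of (timestamp, message)
    tuples, in the order the lines were given. Lines that do not match the
    assumed layout (stack traces, wrapped lines) are skipped.
    """
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("logger").endswith("AllocateInstances"):
            grouped[match.group("thread")].append(
                (match.group("ts"), match.group("msg"))
            )
    return dict(grouped)


# A few lines copied from the excerpt above (illustrative subset only).
SAMPLE = [
    "[2017-07-26 13:42:15] INFO  [pipeline-thread-6] - c.c.l.bootstrap.AllocateInstances: Allocating 6 instances (min count 1) in group workers",
    "[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.l.bootstrap.AllocateInstances: Allocating 1 instances (min count 1) in group masters",
    "[2017-07-26 13:42:15] INFO  [pipeline-thread-5] - c.c.director.aws.ec2.EC2Provider: Found EC2 key name cd-poc for fingerprint",
    "[2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.l.bootstrap.AllocateInstances: Waiting for 0 instances to start running",
    "[2017-07-26 13:42:31] INFO  [pipeline-thread-5] - c.c.l.bootstrap.AllocateInstances: All requested instances failed.",
    "[2017-07-26 13:42:35] INFO  [pipeline-thread-6] - c.c.l.bootstrap.AllocateInstances: Sufficient number of instances available (6/6)",
]

if __name__ == "__main__":
    grouped = allocation_messages_by_thread(SAMPLE)
    for thread in sorted(grouped):
        print(thread)
        for ts, msg in grouped[thread]:
            print(f"  {ts}  {msg}")
```

Run against the excerpt, this puts the masters thread (pipeline-thread-5) side by side with the workers thread (pipeline-thread-6): the masters thread logs "Waiting for 0 instances to start running" and then "All requested instances failed." at 13:42:31, 16 seconds after the request, while the workers thread reports 6/6 instances available.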