Director 1.5: Unavailable instances at the step "Requesting an instance for Cloudera Manager"

Contributor

I am running Director version 1.5 on an Amazon instance that has both client and server on it.

 

When I run bootstrap-remote with my config file, a number of steps complete with messages.

 

It gets stuck right after it posts the message 'Requesting an instance for Cloudera Manager'.

 

I can see the instances being built in AWS. AWS completes its checks on the new machines, and I can log into them.

 

However, Director waits on that step and eventually fails (see below).

 

Director screen output:

 

Process logs can be found at /home/ec2-user/.cloudera-director/logs/application.log
Plugins will be loaded from /var/lib/cloudera-director-client/plugins
Cloudera Director 1.5.0 initializing ...
Logging in as admin ...
Logged in successfully.
Configuration file passes all validation checks.
Creating a new environment...
Creating external database servers if configured...
Creating a new Cloudera Manager...
Creating a new CDH cluster...
* Requesting an instance for Cloudera Manager ..................................
* Insufficient number of instances available in time 20 MINUTES ...
Logging out....

 

The application.log has no messages that jump out as errors.

 

Any ideas where I should look?

 

-  rd

1 ACCEPTED SOLUTION

Expert Contributor

If you look in the log, does it seem that Director is trying and failing to SSH into the newly provisioned instances? If so, you probably have a problem with your network or security group configuration. Can you SSH into one of the instances manually from the instance where you are running Director?
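A minimal sketch of that manual check, assuming Python 3 is available on the instance running Director. `can_reach` is a hypothetical helper, not part of Director. It only confirms that TCP port 22 accepts a connection, which is the part a security group rule would block; it does not verify SSH keys or login.

```python
import socket


def can_reach(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when the port cannot be reached from this machine.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `can_reach("<private IP of a new instance>")` from the Director instance and getting `False` would point toward a network or security group problem rather than a Director problem.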

 

Contributor

Jadair,

 

Thanks for your help. I had a few problems along the way, and many of them came back to my need to fix my security group.

I appreciate your help.

 

-  rd

Contributor





I found an application log in /var/log/cloudera-director-server that had a lot more information. It confirms your suspicion.



It shows the servers being allocated and started:



[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.director.aws.ec2.EC2Provider: << Result: {InstanceStatuses: [{InstanceId: i-31663ad8,AvailabilityZone: us-east-1e$

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 84898b6d-4a27-4d76-8fc6-e6ca99741449 has desired status: RUNNING

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 783a9731-ef93-4079-a293-75e0d6aa7883 has desired status: RUNNING

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 4082410d-249d-481c-b20e-0140c45c8f2e has desired status: RUNNING

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 1eb5ddf4-a182-4908-a8c5-ffd26359aa81 has desired status: RUNNING

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 2dca1909-7208-464d-8f7b-3400cc4d5468 has desired status: RUNNING

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 84898b6d-4a27-4d76-8fc6-e6ca99741449 has status: RUNNING

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 783a9731-ef93-4079-a293-75e0d6aa7883 has status: RUNNING

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 4082410d-249d-481c-b20e-0140c45c8f2e has status: RUNNING

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 1eb5ddf4-a182-4908-a8c5-ffd26359aa81 has status: RUNNING

[2015-10-08 18:45:49] INFO  [pipeline-thread-3] - c.c.l.p.c.PluggableComputeProvider: Instance: 2dca1909-7208-464d-8f7b-3400cc4d5468 has status: RUNNING



Then I get a ton of these prior to it failing:



[2015-10-08 18:45:51] INFO  WaitForSrverOnPortUntilTime/4 Attempting Connection to /10.0.0.[110,108, 112, 228, 111, 109]:22.



It just dawned on me that in the security group I limited port 22 access to my IP address. I am guessing that would prevent Director from doing its work?



Is that a reasonable guess?



-  rd



Expert Contributor

By "my IP address", I assume you mean your machine outside the cloud, rather than the AWS instance running Director. If so, then yes, that is almost certainly the problem. Director does its setup work on the Cloudera Manager and cluster instances via SSH, so you should ensure that your security groups allow that access.
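To illustrate why a rule limited to one outside address shuts Director out, here is a small sketch using Python's standard `ipaddress` module. The addresses are assumptions for illustration only: `203.0.113.7/32` stands in for the poster's workstation, `10.0.0.76` is taken from the `ip-10-0-0-76` shell prompt on the Director instance, and `10.0.0.0/24` is a guessed subnet range. Security groups can also name another security group as the source instead of a CIDR, which this sketch does not model.

```python
import ipaddress
from typing import Iterable


def source_allowed(source_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any CIDR that an inbound rule permits."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(cidr, strict=False)
        for cidr in allowed_cidrs
    )


# Hypothetical values: port 22 open only to one workstation address.
workstation_only = ["203.0.113.7/32"]
# Hypothetical fix: also allow the subnet that holds the Director instance.
with_subnet = ["203.0.113.7/32", "10.0.0.0/24"]

director_ip = "10.0.0.76"  # assumed from the ip-10-0-0-76 prompt
blocked = not source_allowed(director_ip, workstation_only)
fixed = source_allowed(director_ip, with_subnet)
```

With the workstation-only rule, the Director instance's address matches nothing, so its SSH attempts to the new instances would never connect.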

 

Contributor





jadair,



I made the change and I am well past that point (thanks!).



I am getting this error now. It references a log that appears to be an absolute address, but the directory doesn't exist.



This does not appear in the application.log, which appears to be relatively clean.





[ec2-user@ip-10-0-0-76 ~]$ Unexpected internal error (see logs): {"timestamp":1444337043467,"status":500,"error":"Internal Server Error","exception":"com.cloudera.launchpad.api.InternalProcessFailedException","message":"Server Error","path":"/api/v3/environments/Eastern-BD-Cluster%20Environment/deployments/Eastern-BD-Cluster%20Deployment/clusters"}



It appears after these messages:



Process logs can be found at /home/ec2-user/.cloudera-director/logs/application.log

Plugins will be loaded from /var/lib/cloudera-director-client/plugins

Cloudera Director 1.5.0 initializing ...

Logging in as admin ...

Logged in successfully.

Configuration file passes all validation checks.

Creating a new environment...

Creating external database servers if configured...

Creating a new Cloudera Manager...

Creating a new CDH cluster...

Logging out...



-  rd



Expert Contributor

Hmmm. That REST API call is meant to list the clusters under your "Eastern-BD-Cluster Deployment" deployment in your "Eastern-BD-Cluster Environment" environment. When you say that the log is clean, are you referring to the client log or the server log? I am surprised to see a 500 error with no corresponding information in the server application log in /var/log/cloudera-director-server.
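One way to line the error up with the server log is to decode the JSON body that the client printed. The sketch below uses only the Python standard library; `summarize_error` is a hypothetical helper, and it assumes the `timestamp` field is milliseconds since the Unix epoch, which the error message itself does not state.

```python
import json
from datetime import datetime, timezone
from urllib.parse import unquote

# Error body copied from the client output in this thread.
ERROR_BODY = (
    '{"timestamp":1444337043467,"status":500,"error":"Internal Server Error",'
    '"exception":"com.cloudera.launchpad.api.InternalProcessFailedException",'
    '"message":"Server Error",'
    '"path":"/api/v3/environments/Eastern-BD-Cluster%20Environment'
    '/deployments/Eastern-BD-Cluster%20Deployment/clusters"}'
)


def summarize_error(body: str) -> dict:
    """Extract the UTC time, HTTP status, and decoded path segments from the error."""
    payload = json.loads(body)
    # Assumption: the timestamp is in milliseconds since the Unix epoch.
    when = datetime.fromtimestamp(payload["timestamp"] / 1000, tz=timezone.utc)
    # Percent-decode each path segment (%20 becomes a space).
    segments = [unquote(part) for part in payload["path"].split("/") if part]
    return {
        "utc_time": when.strftime("%Y-%m-%d %H:%M:%S"),
        "status": payload["status"],
        "path_segments": segments,
    }


summary = summarize_error(ERROR_BODY)
```

Under that assumption the timestamp works out to 2015-10-08 20:44:03 UTC, which gives a specific second to search for in the server application log.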

 

Contributor





This is all that is in the /var/log/cloudera-director-server/application.log for that time period.



[2015-10-08 20:43:51] INFO  [qtp541471631-24] - c.c.l.a.c.AuthenticationResource: Logging in admin via API

[2015-10-08 20:43:52] INFO  [qtp541471631-24] - o.s.b.a.audit.listener.AuditListener: AuditEvent [timestamp=Thu Oct 08 20:43:52 UTC 2015, principal=admin, type=AUTHENT$

[2015-10-08 20:44:02] INFO  [qtp541471631-26] - c.c.l.b.v.GenericEnvironmentValidator: Validating environment Eastern-BD-Cluster Environment

[2015-10-08 20:44:02] INFO  [qtp541471631-26] - c.c.l.m.PrivateKeySshCredentialsValidator: Validating SSH credentials for ec2-user

[2015-10-08 20:44:02] INFO  [qtp541471631-26] - c.c.l.p.c.PluggableComputeEnvironmentValidator: Validating environment for compute provider: aws

[2015-10-08 20:44:02] INFO  [qtp541471631-26] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'us-east-1'

[2015-10-08 20:44:02] INFO  [qtp541471631-26] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.us-east-1.amazonaws.com' for region 'us-east-1'

[2015-10-08 20:44:02] INFO  [qtp541471631-26] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'us-east-1'

[2015-10-08 20:44:02] INFO  [qtp541471631-26] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.us-east-1.amazonaws.com' for region 'us-east-1'

[2015-10-08 20:44:02] INFO  [qtp541471631-26] - c.c.l.p.d.PluggableDatabaseServerEnvironmentValidator: Validating environment for database server provider: aws

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.l.b.v.GenericDeploymentTemplateValidator: Validating deployment template: Eastern-BD-Cluster Deployment

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.l.b.v.GenericDeploymentTemplateValidator: Validating repository URLs

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.l.b.v.GenericDeploymentTemplateValidator: Validating external databases and templates

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.l.p.c.PluggableComputeDeploymentTemplateValidator: Validating Cloudera Manager virtual instance template

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.l.p.c.PluggableComputeInstanceTemplateValidator: Validating instance template for compute provider: aws

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'us-east-1'

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.us-east-1.amazonaws.com' for region 'us-east-1'

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'us-east-1'

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.us-east-1.amazonaws.com' for region 'us-east-1'

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'us-east-1'

[2015-10-08 20:44:02] INFO  [qtp541471631-27] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.us-east-1.amazonaws.com' for region 'us-east-1'

[2015-10-08 20:44:03] INFO  [qtp541471631-27] - c.c.director.aws.ec2.EC2Provider: Found EC2 key name easternct-big-data-keypair for fingerprint

[2015-10-08 20:44:03] INFO  [qtp541471631-27] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing AMI 'ami-8767d1ec'

[2015-10-08 20:44:03] INFO  [qtp541471631-27] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing subnet 'subnet-de1fcbe3'

[2015-10-08 20:44:03] INFO  [qtp541471631-27] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing security group 'sg-d68676b0'

[2015-10-08 20:44:03] INFO  [qtp541471631-27] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing key pair



I think I am going to restart the server and rerun. Perhaps there is something left over from the earlier problems.





rd