
Cloudera Director cloudera-director bootstrap-remote command fails with "Suspended due to failure"

Rising Star

Hi,

Just wondering if anyone can help.

I have Cloudera Director 2 running on an AWS EC2 "c4.xlarge" instance with CentOS 6.7.

I have created a new config file on this Director instance (/usr/lib64/cloudera-director/client/client1_dev_cdh_cluster.aws.cluster.conf).

The config file references a single AWS VPC Subnet and Security Group (with the specific inbound/outbound rules defined in the Director 2.0 User Guide).

The config file uses my own AWS AMIs (built around a CentOS 6.7 HVM "r3.xlarge" for the Cloudera Manager instance and a CentOS 6.7 HVM "d2.2xlarge" for specific Master instances).

I'm using my own AMIs to get around the known bug that prevents resizing of CentOS /root partitions (my AMIs have 50GB /root partitions).

The config file does not use any external databases (just the normal H2 database and local PostgreSQL databases for the Cloudera amon/rman/nav/navms/hue/hive metastore, etc.).

NOTE: With the same AWS VPC, Subnets, and Security Groups and the same AMIs, I can use the Cloudera Director Server GUI to create the Director instance and the Cloudera Manager instance without an issue.

When I try to create a new cluster with the following "bootstrap-remote" command in the Cloudera Director Client, it fails.

This is my first real play with the Cloudera Director Client and the "bootstrap-remote" command, so it could be an issue with my config file (but I can't tell).

The IP address listed below is an AWS private IP address (I have an Elastic IP associated with the Director instance, but for obvious reasons I'm not going to display that in this community post).

[root@]#  cloudera-director bootstrap-remote client1_dev_cdh_cluster.aws.cluster.conf --lp.remote.username=admin --lp.remote.password={obfuscated} --lp.remote.hostAndPort=10.0.1.247:7189

 

Process logs can be found at /root/.cloudera-director/logs/application.log

Plugins will be loaded from /var/lib/cloudera-director-plugins

Cloudera Director 2.0.0 initializing ...

Connecting to http://10.0.1.247:7189

Current user roles: [ROLE_ADMIN, ROLE_READONLY]

Configuration file passes all validation checks.

Creating a new environment...

Creating external database servers if configured...

Creating a new Cloudera Manager...

Creating a new CDH cluster...

* Requesting an instance for Cloudera Manager ... done

* Suspended due to failure ...

 

 

When I review the log file, it shows the output below.

Where might I look to find additional errors?

No Cloudera Manager instance is created, so I cannot look in the Cloudera Manager log.

[root@]# cd /root/.cloudera-director/logs

 

[root@]# cat application.log

 

[2016-04-19 14:27:05] INFO  [main] - o.f.core.internal.command.DbValidate: Validated 11 migrations (execution time 00:00.021s)

[2016-04-19 14:27:05] INFO  [main] - o.f.core.internal.command.DbMigrate: Current version of schema "PUBLIC": 3.2.0.0.3

[2016-04-19 14:27:05] INFO  [main] - o.f.core.internal.command.DbMigrate: Schema "PUBLIC" is up to date. No migration necessary.

[2016-04-19 14:27:06] INFO  [main] - o.s.o.j.LocalContainerEntityManagerFactoryBean: Building JPA container EntityManagerFactory for persistence unit 'default'

[2016-04-19 14:27:06] INFO  [main] - o.h.jpa.internal.util.LogHelper: HHH000204: Processing PersistenceUnitInfo [

name: default

...]

[2016-04-19 14:27:06] INFO  [main] - org.hibernate.Version: HHH000412: Hibernate Core {4.3.10.Final}

[2016-04-19 14:27:06] INFO  [main] - org.hibernate.cfg.Environment: HHH000206: hibernate.properties not found

[2016-04-19 14:27:06] INFO  [main] - org.hibernate.cfg.Environment: HHH000021: Bytecode provider name : javassist

[2016-04-19 14:27:06] INFO  [main] - o.h.annotations.common.Version: HCANN000001: Hibernate Commons Annotations {4.0.5.Final}

[2016-04-19 14:27:06] INFO  [main] - org.hibernate.dialect.Dialect: HHH000400: Using dialect: org.hibernate.dialect.H2Dialect

[2016-04-19 14:27:06] INFO  [main] - o.h.h.i.a.ASTQueryTranslatorFactory: HHH000397: Using ASTQueryTranslatorFactory

[2016-04-19 14:27:07] INFO  [main] - c.c.l.c.RemoteCommandProperties: Overriding hostAndPort=10.0.1.247:7189 (default localhost:7189)

[2016-04-19 14:27:07] INFO  [main] - c.c.l.c.RemoteCommandProperties: Overriding default password

[2016-04-19 14:27:07] INFO  [main] - c.c.l.c.RemoteCommandProperties: Overriding username=admin (default null)

[2016-04-19 14:27:08] INFO  [main] - c.c.l.p.c.PluggableProviderConfig: Overriding blacklist=[byon, sandbox] (default [])

[2016-04-19 14:27:08] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Looking for providers in JAR file /var/lib/cloudera-director-plugins/aws-provider-1.1.0/aws-provider-1.1.0.jar

[2016-04-19 14:27:08] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Loaded launcher com.cloudera.director.aws.AWSLauncher

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Looking for providers in JAR file /var/lib/cloudera-director-plugins/sandbox-provider-1.1.0/sandbox-provider-1.1.0.jar

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Loaded launcher com.cloudera.director.sandbox.SandboxLauncher

[2016-04-19 14:27:09] WARN  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Cannot read configuration file: /var/lib/cloudera-director-plugins/sandbox-provider-1.1.0/etc/labels.conf

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Not loading blacklisted provider sandbox.

[2016-04-19 14:27:09] WARN  [main] - c.c.l.p.c.PluggableCloudProviderFactory: No providers registered from JAR sandbox-provider-1.1.0.jar

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: No plugin JAR found in plugin directory /var/lib/cloudera-director-plugins/sandbox-provider-1.1.0

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Looking for providers in JAR file /var/lib/cloudera-director-plugins/google-provider-1.0.1/google-provider-1.0.1.jar

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Loaded launcher com.cloudera.director.google.GoogleLauncher

[2016-04-19 14:27:09] WARN  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Cannot read configuration file: /var/lib/cloudera-director-plugins/google-provider-1.0.1/etc/labels.conf

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Looking for providers in JAR file /var/lib/cloudera-director-plugins/byon-provider-example-1.0.0/byon-provider-example-1.0.0.jar

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Loaded launcher com.cloudera.director.byon.BYONLauncher

[2016-04-19 14:27:09] WARN  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Cannot read configuration file: /var/lib/cloudera-director-plugins/byon-provider-example-1.0.0/etc/labels.conf

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Not loading blacklisted provider byon.

[2016-04-19 14:27:09] WARN  [main] - c.c.l.p.c.PluggableCloudProviderFactory: No providers registered from JAR byon-provider-example-1.0.0.jar

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: No plugin JAR found in plugin directory /var/lib/cloudera-director-plugins/byon-provider-example-1.0.0

[2016-04-19 14:27:09] INFO  [main] - c.c.l.p.c.PluggableCloudProviderFactory: Skipping file /var/lib/cloudera-director-plugins/README.md

[2016-04-19 14:27:09] INFO  [main] - c.c.l.parcel.ParcelInfoExtractor: Overriding userAgentQualifier=${lp.cloudera.manager.configuration.creatorTagQualifier} (default Optional.absent())

[2016-04-19 14:27:09] INFO  [main] - c.c.l.parcel.ParcelInfoExtractor: Parcel manifest extractor will use 'Cloudera Director/2.0.0 (${lp.cloudera.manager.configuration.creatorTagQualifier})' as the User-Agent

[2016-04-19 14:27:09] INFO  [main] - c.c.l.config.PipelineServiceConfig: Overriding minWaitBetweenAttempts=10 (default 10)

[2016-04-19 14:27:09] INFO  [main] - c.c.l.config.PipelineServiceConfig: Overriding maxWaitBetweenAttempts=60 (default -1)

[2016-04-19 14:27:09] INFO  [main] - c.c.l.config.PipelineServiceConfig: Overriding maxNumberOfAttempts=-1 (default -1)

[2016-04-19 14:27:09] INFO  [main] - c.c.l.config.PipelineServiceConfig: Pipeline retry behavior: minWaitBetweenAttempts=10, maxNumberOfAttempts=-1, maxWaitBetweenAttempts=60

[2016-04-19 14:27:10] INFO  [main] - c.c.launchpad.config.MetricsConfig: Metrics reporting is disabled.

[2016-04-19 14:27:13] INFO  [main] - c.c.l.p.c.PluggableComputeEnvironmentValidator: Validating environment for compute provider: aws

[2016-04-19 14:27:14] INFO  [main] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'ap-southeast-2'

[2016-04-19 14:27:16] INFO  [main] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.ap-southeast-2.amazonaws.com' for region 'ap-southeast-2'

[2016-04-19 14:27:17] INFO  [main] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'ap-southeast-2'

[2016-04-19 14:27:18] INFO  [main] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.ap-southeast-2.amazonaws.com' for region 'ap-southeast-2'

[2016-04-19 14:27:18] INFO  [main] - c.c.l.p.d.PluggableDatabaseServerEnvironmentValidator: Validating environment for database server provider: aws

[2016-04-19 14:27:18] INFO  [main] - c.c.l.b.v.GenericEnvironmentValidator: Validating environment Client1_DEV_CDH_Cluster Environment

[2016-04-19 14:27:18] INFO  [main] - c.c.l.m.PrivateKeySshCredentialsValidator: Validating SSH credentials for centos

[2016-04-19 14:27:18] INFO  [main] - c.c.l.p.c.PluggableComputeDeploymentTemplateValidator: Validating Cloudera Manager virtual instance template

[2016-04-19 14:27:18] INFO  [main] - c.c.l.p.c.PluggableComputeInstanceTemplateValidator: Validating instance template for compute provider: aws

[2016-04-19 14:27:18] INFO  [main] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'ap-southeast-2'

[2016-04-19 14:27:19] INFO  [main] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.ap-southeast-2.amazonaws.com' for region 'ap-southeast-2'

[2016-04-19 14:27:20] INFO  [main] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'ap-southeast-2'

[2016-04-19 14:27:21] INFO  [main] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.ap-southeast-2.amazonaws.com' for region 'ap-southeast-2'

[2016-04-19 14:27:21] INFO  [main] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'ap-southeast-2'

[2016-04-19 14:27:22] INFO  [main] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.ap-southeast-2.amazonaws.com' for region 'ap-southeast-2'

[2016-04-19 14:27:22] INFO  [main] - c.c.director.aws.ec2.EC2Provider: Found EC2 key name DevDirectorEnv_KeyPair for fingerprint

[2016-04-19 14:27:22] INFO  [main] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing AMI 'ami-c08dafa3'

[2016-04-19 14:27:22] INFO  [main] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing subnet 'subnet-11211174'

[2016-04-19 14:27:22] INFO  [main] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing security group 'sg-eaed398e'

[2016-04-19 14:27:22] INFO  [main] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing key pair

[2016-04-19 14:27:22] INFO  [main] - c.c.l.b.v.GenericDeploymentTemplateValidator: Validating deployment template: Client1_DEV_CDH_Cluster Deployment

[2016-04-19 14:27:22] INFO  [main] - c.c.l.b.v.GenericDeploymentTemplateValidator: Validating repository URLs

[2016-04-19 14:27:22] INFO  [main] - c.c.l.b.v.GenericDeploymentTemplateValidator: Validating external databases and templates

[2016-04-19 14:27:22] INFO  [main] - c.c.l.p.c.PluggableComputeClusterTemplateValidator: Validating virtual instances of cluster Client1_DEV_CDH_Cluster

[2016-04-19 14:27:22] INFO  [main] - c.c.l.p.c.PluggableComputeInstanceTemplateValidator: Validating instance template for compute provider: aws

[2016-04-19 14:27:22] INFO  [main] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'ap-southeast-2'

[2016-04-19 14:27:23] INFO  [main] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.ap-southeast-2.amazonaws.com' for region 'ap-southeast-2'

[2016-04-19 14:27:24] INFO  [main] - c.c.director.aws.ec2.EC2Provider: >> Describing all regions to find endpoint for 'ap-southeast-2'

[2016-04-19 14:27:25] INFO  [main] - c.c.director.aws.ec2.EC2Provider: << Found endpoint 'ec2.ap-southeast-2.amazonaws.com' for region 'ap-southeast-2'

[2016-04-19 14:27:25] INFO  [main] - c.c.director.aws.ec2.EC2Provider: Found EC2 key name Damion_DevDirectorEnv_KeyPair for fingerprint

[2016-04-19 14:27:25] INFO  [main] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing AMI 'ami-9a92b0f9'

[2016-04-19 14:27:25] INFO  [main] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing subnet 'subnet-11211174'

[2016-04-19 14:27:25] INFO  [main] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing security group 'sg-eaed398e'

[2016-04-19 14:27:25] INFO  [main] - c.c.d.a.e.EC2InstanceTemplateConfigurationValidator: >> Describing key pair

[2016-04-19 14:27:25] INFO  [main] - c.c.l.b.v.GenericClusterTemplateValidator: Validating parcel URL and version compatibility

[2016-04-19 14:28:44] INFO  [Thread-2] - o.s.c.a.AnnotationConfigApplicationContext: Closing org.springframework.context.annotation.AnnotationConfigApplicationContext@28e51e50: startup date [Tue Apr 19 14:26:59 EST 2016]; root of context hierarchy

[2016-04-19 14:28:44] INFO  [Thread-2] - o.s.o.j.LocalContainerEntityManagerFactoryBean: Closing JPA EntityManagerFactory for persistence unit 'default'

[2016-04-19 14:28:44] WARN  [Thread-2] - c.c.l.c.security.CipherSchemeFactory: Allowing cipher scheme to be set again, prior scheme is com.cloudera.launchpad.common.security.TripleDESCipher@1d19c0d2

 

 

 

Thanks,

 

Damion.
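
For anyone triaging a log like the one in the question, here is a minimal Python sketch that lists the non-INFO lines and flags long silences between consecutive entries. The regular expression is inferred from the log lines quoted above; `parse_log`, `summarize`, and the 30-second threshold are illustrative names and values, not part of Cloudera Director. In this log the WARN lines concern plugin loading and shutdown, and there is a quiet stretch between 14:27:25 and 14:28:44, which suggests the client log does not hold the failure. Because "bootstrap-remote" hands the work to the Director server, the server-side log is probably the better place to look (its location depends on the install and is not shown in this thread).

```python
import re
from datetime import datetime

# Pattern inferred from lines such as:
# [2016-04-19 14:27:09] WARN  [main] - c.c.l.p.c.PluggableCloudProviderFactory: message
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+-\s+"
    r"(?P<logger>[^:]+):\s?(?P<msg>.*)$"
)


def parse_log(lines):
    """Yield (timestamp, level, logger, message) for each line that matches."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
            yield ts, m.group("level"), m.group("logger"), m.group("msg")


def summarize(lines, gap_seconds=30):
    """Return (non-INFO entries, gaps longer than gap_seconds as (start, end))."""
    entries = list(parse_log(lines))
    problems = [e for e in entries if e[1] != "INFO"]
    gaps = []
    for prev, cur in zip(entries, entries[1:]):
        if (cur[0] - prev[0]).total_seconds() > gap_seconds:
            gaps.append((prev[0], cur[0]))
    return problems, gaps


# Three lines copied from the log in the question.
SAMPLE = """\
[2016-04-19 14:27:09] WARN  [main] - c.c.l.p.c.PluggableCloudProviderFactory: No providers registered from JAR sandbox-provider-1.1.0.jar
[2016-04-19 14:27:25] INFO  [main] - c.c.l.b.v.GenericClusterTemplateValidator: Validating parcel URL and version compatibility
[2016-04-19 14:28:44] INFO  [Thread-2] - o.s.o.j.LocalContainerEntityManagerFactoryBean: Closing JPA EntityManagerFactory for persistence unit 'default'
""".splitlines()

if __name__ == "__main__":
    problems, gaps = summarize(SAMPLE)
    for ts, level, logger, msg in problems:
        print(ts, level, logger, msg)
    for start, end in gaps:
        print("gap:", start, "->", end)
```

To run it against a real file, pass `open("/root/.cloudera-director/logs/application.log")` to `summarize` instead of `SAMPLE`.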

1 ACCEPTED SOLUTION

Rising Star

Hi,

I have found the issue. Let's just say it was a case of PEBCAK.

My AMIs were created with 500GB /root volumes, but in my Director config file I had mistyped the /root volume size as 50GB.

I have changed it to 500GB in the config file and am re-running "bootstrap-remote".

Thanks,

 

Damion.
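
The mismatch described in this solution can be caught before running "bootstrap-remote". EC2 rejects a launch when the requested root volume is smaller than the AMI's root snapshot, so comparing the two numbers up front is cheap. The Python sketch below is only an illustration under stated assumptions: it takes a dict shaped like one entry of the EC2 DescribeImages response (`RootDeviceName`, `BlockDeviceMappings`, `Ebs.VolumeSize`) and the root volume size taken from the Director config file (assumed here to be the instance template's `rootVolumeSizeGB` value). The function names and the sample image are hypothetical.

```python
def root_volume_size_gb(image):
    """Return the EBS size in GB of the image's root device, or None if absent."""
    root_name = image.get("RootDeviceName")
    for mapping in image.get("BlockDeviceMappings", []):
        if mapping.get("DeviceName") == root_name and "Ebs" in mapping:
            return mapping["Ebs"].get("VolumeSize")
    return None


def check_root_volume(image, configured_gb):
    """Compare the configured root volume size with the AMI's root snapshot size."""
    ami_gb = root_volume_size_gb(image)
    if ami_gb is None:
        return "unknown: no EBS root device found in the image description"
    if configured_gb < ami_gb:
        return (
            f"too small: config requests {configured_gb} GB "
            f"but the AMI root snapshot is {ami_gb} GB"
        )
    return "ok"


# Hypothetical image description, shaped like one item of "Images" in a
# DescribeImages response. The ID and device name are placeholders.
SAMPLE_IMAGE = {
    "ImageId": "ami-00000000",
    "RootDeviceName": "/dev/sda1",
    "BlockDeviceMappings": [
        {"DeviceName": "/dev/sda1", "Ebs": {"VolumeSize": 500}},
    ],
}

if __name__ == "__main__":
    print(check_root_volume(SAMPLE_IMAGE, 50))   # the mistyped value
    print(check_root_volume(SAMPLE_IMAGE, 500))  # the corrected value
```

The image description itself can be fetched with `aws ec2 describe-images --image-ids <your AMI ID>` and loaded as JSON before being passed to `check_root_volume`.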

Community Manager

It happens to everyone. Thanks for sharing the solution. 🙂


Cy Jervis, Manager, Community Program