Created on 11-05-2015 11:37 AM - edited 09-16-2022 08:39 AM
Hello,
I am working on a project to automate a development cluster startup on AWS using Cloudera Director 1.5.1. For our purposes, we would like to be able to bootstrap and terminate this cluster on a daily basis. I was able to bootstrap the cluster successfully the first time; however, I am now getting an error when Cloudera Director bootstraps the Cloudera Manager node. The high-level logs are below.
Process logs can be found at /home/ec2-user/.cloudera-director/logs/application.log
Plugins will be loaded from /var/lib/cloudera-director-client/plugins
Cloudera Director 1.5.1 initializing ...
Installing Cloudera Manager ...
* Starting ...... done
* Requesting an instance for Cloudera Manager ............................ done
* Inspecting capabilities of 10.172.4.37 .......... done
* Installing screen package (1/1) ........ done
* Running custom bootstrap script on 10.172.4.37 ....... done
* Waiting for SSH access to 10.172.4.37 on port 22 ...... done
* Inspecting capabilities of 10.172.4.37 ................ done
* Normalizing 10.172.4.37 ....... done
* Installing ntp package (1/4) ...... done
* Installing curl package (2/4) ...... done
* Installing nscd package (3/4) ...... done
* Installing gdisk package (4/4) ......................... done
* Resizing instance root partition ............ done
* Rebooting 10.172.4.37 .... done
* Waiting for 10.172.4.37 to boot ...... done
* Mounting all instance disk drives ........... done
* Waiting for new external database servers to start running .......... done
* Installing repositories for Cloudera Manager ......... done
* Installing cloudera-manager-daemons package (1/2) ...... done
* Installing cloudera-manager-server package (2/2) ...... done
* Configuring external POSTGRESQL database for Cloudera Manager ........ done
* Starting Cloudera Manager server ... done
* Waiting for Cloudera Manager server to start ...... done
* Setting Cloudera Manager License ... done
* ClouderaManagerException{message="API call to Cloudera Manager failed. Method=ClouderaManagerResourceV6.beginTrial, Args=null",causeClass=class javax.ws.rs.BadRequestException, causeMessage="null"} ...
The failure occurs when Cloudera Director calls the beginTrial API on Cloudera Manager; the stack trace is included below. I don't think anything has changed since the last time this ran successfully.
[2015-11-05 10:41:59] ERROR [pipeline-thread-1] - c.c.l.p.DatabasePipelineRunner: Encountered an unrecoverable error
com.cloudera.launchpad.pipeline.UnrecoverablePipelineError: ClouderaManagerException{message="API call to Cloudera Manager failed. Method=ClouderaManagerResourceV6.beginTrial, Args=null",causeClass=class javax.ws.rs.BadRequestException, causeMessage="null"}
    at com.cloudera.launchpad.bootstrap.deployment.LicenseClouderaManager.run(LicenseClouderaManager.java:84) ~[launchpad-bootstrap-1.5.1.jar!/:1.5.1]
    at com.cloudera.launchpad.bootstrap.deployment.LicenseClouderaManager.run(LicenseClouderaManager.java:34) ~[launchpad-bootstrap-1.5.1.jar!/:1.5.1]
    at com.cloudera.launchpad.pipeline.job.Job3.runUnchecked(Job3.java:32) ~[launchpad-pipeline-1.5.1.jar!/:1.5.1]
    at com.cloudera.launchpad.pipeline.job.Job3$$FastClassBySpringCGLIB$$54178503.invoke(<generated>) ~[spring-core-4.1.5.RELEASE.jar!/:1.5.1]
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) ~[spring-core-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:717) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:97) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at com.cloudera.launchpad.pipeline.PipelineJobProfiler$1.call(PipelineJobProfiler.java:55) ~[launchpad-pipeline-1.5.1.jar!/:1.5.1]
    at com.codahale.metrics.Timer.time(Timer.java:101) ~[metrics-core-3.1.0.jar!/:3.1.0]
    at com.cloudera.launchpad.pipeline.PipelineJobProfiler.profileJobRun(PipelineJobProfiler.java:51) ~[launchpad-pipeline-1.5.1.jar!/:1.5.1]
    at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.6.0_30]
    at java.lang.reflect.Method.invoke(Method.java:622) ~[na:1.6.0_30]
    at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:68) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:653) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at com.cloudera.launchpad.bootstrap.deployment.LicenseClouderaManager$$EnhancerBySpringCGLIB$$4c61360c.runUnchecked(<generated>) ~[spring-core-4.1.5.RELEASE.jar!/:1.5.1]
    at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:165) ~[launchpad-pipeline-1.5.1.jar!/:1.5.1]
    at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:136) ~[launchpad-pipeline-1.5.1.jar!/:1.5.1]
    at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78) ~[guava-retrying-1.0.6.jar!/:na]
    at com.github.rholder.retry.Retryer.call(Retryer.java:110) ~[guava-retrying-1.0.6.jar!/:na]
    at com.cloudera.launchpad.pipeline.util.PipelineRunner.attemptMultipleJobExecutionsWithRetries(PipelineRunner.java:98) ~[launchpad-pipeline-1.5.1.jar!/:1.5.1]
    at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.run(DatabasePipelineRunner.java:120) ~[launchpad-pipeline-database-1.5.1.jar!/:1.5.1]
    at com.cloudera.launchpad.ExceptionHandlingRunnable.run(ExceptionHandlingRunnable.java:57) ~[launchpad-common-1.5.1.jar!/:1.5.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.6.0_30]
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) ~[na:1.6.0_30]
    at java.util.concurrent.FutureTask.run(FutureTask.java:166) ~[na:1.6.0_30]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) ~[na:1.6.0_30]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.6.0_30]
    at java.lang.Thread.run(Thread.java:701) ~[na:1.6.0_30]
Caused by: com.cloudera.api.ext.ClouderaManagerException: API call to Cloudera Manager failed. Method=ClouderaManagerResourceV6.beginTrial, Args=null
    at com.cloudera.api.ext.ClouderaManagerClientProxy.invoke(ClouderaManagerClientProxy.java:79) ~[launchpad-cloudera-manager-api-ext-1.5.1.jar!/:na]
    at com.sun.proxy.$Proxy146.beginTrial(Unknown Source) ~[na:na]
    at com.cloudera.launchpad.bootstrap.deployment.LicenseClouderaManager.run(LicenseClouderaManager.java:77) ~[launchpad-bootstrap-1.5.1.jar!/:1.5.1]
    ... 34 common frames omitted
Any help would be appreciated,
Thanks
Created 12-03-2015 11:40 AM
The cause of this error was determined to be the externally managed database. This database must be completely cleared manually between bootstrap attempts for the bootstrap to succeed.
One way to avoid this would be to switch to database templates instead of setting up the database beforehand.
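For anyone who wants to script the manual clearing step, here is a minimal sketch of one way to do it, assuming the external database is PostgreSQL (as in the log above) and that dropping and recreating the Cloudera Manager database is acceptable in your environment. The function names, the database name "scm", and the owner role "scm" are all hypothetical placeholders, not values taken from this thread. The sketch shells out to the standard psql client and expects credentials to come from PGPASSWORD or ~/.pgpass.

```python
import subprocess
from typing import List


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def reset_statements(db_name: str, owner: str) -> List[str]:
    """Build the SQL that drops and recreates an empty database."""
    return [
        f"DROP DATABASE IF EXISTS {quote_ident(db_name)};",
        f"CREATE DATABASE {quote_ident(db_name)} OWNER {quote_ident(owner)};",
    ]


def reset_database(host: str, admin_user: str, db_name: str, owner: str,
                   port: int = 5432, dry_run: bool = True) -> List[str]:
    """Clear the database before a new bootstrap attempt.

    With dry_run=True (the default) nothing is executed; the statements
    are only returned so they can be reviewed first.
    """
    statements = reset_statements(db_name, owner)
    if dry_run:
        return statements
    for stmt in statements:
        # Each statement runs in its own psql call because DROP DATABASE
        # cannot be executed inside a transaction block.
        subprocess.run(
            ["psql", "-h", host, "-p", str(port), "-U", admin_user,
             "-d", "postgres", "-v", "ON_ERROR_STOP=1", "-c", stmt],
            check=True,
        )
    return statements
```

Calling reset_database(...) with dry_run=False as the last step of the daily terminate job (or the first step of the bootstrap job) would leave an empty database for the next run. Note that DROP DATABASE fails while other sessions are still connected, so the old Cloudera Manager instance should be stopped or terminated first.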
Created 11-05-2015 02:07 PM
Hi,
I would check your AWS VPC network policies and make sure the Director node can talk to the CM node over all ports on the local subnet.
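A quick way to test this from the Director node is a plain TCP connect check. This is only a rough sketch: the function name is made up for illustration, and the port list is an assumption (22 for SSH, which the bootstrap log shows being used, and 7180, the default Cloudera Manager HTTP port). Adjust both to your setup.

```python
import socket
from typing import Dict, Iterable


def check_ports(host: str, ports: Iterable[int],
                timeout: float = 3.0) -> Dict[int, bool]:
    """Return a mapping of port -> True if a TCP connection succeeded."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal or timeout
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


if __name__ == "__main__":
    # 10.172.4.37 is the CM node address from the bootstrap log.
    for port, ok in check_ports("10.172.4.37", [22, 7180]).items():
        print(f"port {port}: {'open' if ok else 'unreachable'}")
```

A successful connect only shows that the security group and routing allow the traffic; it says nothing about whether the API call itself will succeed.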
Created 11-05-2015 02:30 PM
We have a security group set up to allow all inbound and outbound connections. Also, the cluster started correctly before using the same VPC, subnet, and security group, so I don't think it is an AWS connection issue.