Explorer
Posts: 8
Registered: ‎08-24-2017

Cloudera Director Fails on 'Deploying Cloudera Manager Agent'

While creating a Cloudera Manager instance using the Director UI, I get the following error:

[2017-10-02 09:56:34.626 -0400] ERROR [p-5f780de113fc-DefaultBootstrapDeploymentJob] 5c96b569-181b-4afd-82d4-f339c09a9a90 POST /api/v9/environments/GPUS-DEV/deployments com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure - c.c.l.p.DatabasePipelineRunner: Encountered an unrecoverable error ErrorInfo{code=CM_AGENT_INSTALLATION_FAIL, properties={instanceIpAddress=10.20.30.71, retryCount=5}, causes=[]} in job com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure
com.cloudera.launchpad.pipeline.UnrecoverablePipelineError: Cloudera Manager agent installation failed on instance '10.20.30.71' after 5 tries.
at com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure.run(BootstrapClouderaManagerAgent.java:350)
at com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure.run(BootstrapClouderaManagerAgent.java:294)
at com.cloudera.launchpad.pipeline.job.Job5.runUnchecked(Job5.java:34)
at com.cloudera.launchpad.pipeline.job.Job5$$FastClassBySpringCGLIB$$54178505.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:721)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
at com.cloudera.launchpad.pipeline.PipelineJobProfiler.profileJobRun(PipelineJobProfiler.java:60)
at sun.reflect.GeneratedMethodAccessor185.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:656)
at com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure$$EnhancerBySpringCGLIB$$5678a7a3.runUnchecked(<generated>)
at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:197)
at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:168)
at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
at com.github.rholder.retry.Retryer.call(Retryer.java:160)
at com.cloudera.launchpad.pipeline.util.PipelineRunner.attemptMultipleJobExecutionsWithRetries(PipelineRunner.java:133)
at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.run(DatabasePipelineRunner.java:164)
at com.cloudera.launchpad.ExceptionHandlingRunnable.run(ExceptionHandlingRunnable.java:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)



This was working fine on Friday.

Explorer
Posts: 8
Registered: ‎08-24-2017

Re: Cloudera Director Fails on 'Deploying Cloudera Manager Agent'

I tried using the bootstrap script as well, and it again gets stuck on 'Waiting for Cloudera Manager to deploy agent on ...':

[ec2-user@ip-10-20-30-7 logs]$ sudo cloudera-director bootstrap /home/ec2-user/cluster.conf
Process logs can be found at /root/.cloudera-director/logs/application.log
Plugins will be loaded from /var/lib/cloudera-director-plugins
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256M; support was removed in 8.0
Cloudera Director 2.5.1 initializing ...
Installing Cloudera Manager ...
* Starting ..... done
* Requesting an instance for Cloudera Manager .......................... done
* Installing screen package (1/1) ....... done
* Running bootstrap script #1 (crc32: 52325d91) ........... done
* Waiting until 2017-10-02T10:40:20.289-04:00 for SSH access to [10.20.30.232, ip-10-20-30-232.ec2.internal, 52.70.44.135, ec2-52-70-44-135.compute-1.amazonaws.com], default port 22 ...... done
* Inspecting capabilities of 10.20.30.232 .......... done
* Normalizing 0e5eada5-888e-41f1-b49f-529a490a7511 ..... done
* Installing ntp package (1/4) ...... done
* Installing curl package (2/4) .... done
* Installing nscd package (3/4) ...... done
* Installing gdisk package (4/4) ............................ done
* Resizing instance root partition ......... done
* Mounting all instance disk drives ............. done
* Waiting for new external database servers to start running ......... done
* Installing repositories for Cloudera Manager ....... done
* Installing oracle-j2sdk1.7 package (1/4) ...... done
* Installing yum-utils package (2/4) ..... done
* Installing cloudera-manager-daemons package (3/4) ..... done
* Installing cloudera-manager-server package (4/4) ...... done
* Setting up embedded PostgreSQL database for Cloudera Manager ..... done
* Installing cloudera-manager-server-db-2 package (1/1) ..... done
* Starting embedded PostgreSQL database ....... done
* Starting Cloudera Manager server ... done
* Waiting for Cloudera Manager server to start .... done
* Changing admin Credentials for Cloudera Manager ... done
* Setting Cloudera Manager License ... done
* Enabling Enterprise Trial ... done
* Configuring Cloudera Manager .... done
* Deploying Cloudera Manager agent ....... done
* Waiting for Cloudera Manager to deploy agent on 10.20.30.232 ..................

Explorer
Posts: 8
Registered: ‎08-24-2017

Re: Cloudera Director Fails on 'Deploying Cloudera Manager Agent'

It seems to me that the step that installs the Cloudera Manager agent is being skipped. Was a change made to this that could have broken the process?

Cloudera Employee
Posts: 52
Registered: ‎10-28-2014

Re: Cloudera Director Fails on 'Deploying Cloudera Manager Agent'

Have you already checked the Director server log for more details about the failure? In addition to looking for ERROR log messages and stack traces, you can look to see if Director was able to download diagnostic data from Cloudera Manager with additional information about the reason for agent installation failure.
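
If the log is large, a short script can help pick out the relevant entries. Below is a minimal Python sketch, assuming that ERROR lines embed an `ErrorInfo{code=..., properties={...}}` block like the one in the trace quoted at the top of this thread. The regular expression and the names `find_failures` and `Failure` are illustrative only and are not part of Director; other Director versions may format these lines differently.

```python
import re
from typing import Dict, Iterable, Iterator, NamedTuple

# Assumed pattern, modelled on the ErrorInfo{...} text in the log line quoted
# in the first post of this thread. It is not an official log format spec.
ERROR_INFO = re.compile(
    r"ErrorInfo\{code=(?P<code>\w+), properties=\{(?P<props>[^}]*)\}"
)


class Failure(NamedTuple):
    code: str
    properties: Dict[str, str]


def find_failures(lines: Iterable[str]) -> Iterator[Failure]:
    """Yield one Failure per ERROR line that carries an ErrorInfo block."""
    for line in lines:
        if " ERROR " not in line:
            continue
        match = ERROR_INFO.search(line)
        if match is None:
            continue
        # Split "key=value, key=value" into a dict; values stay as strings.
        props = dict(
            pair.split("=", 1)
            for pair in match.group("props").split(", ")
            if "=" in pair
        )
        yield Failure(match.group("code"), props)
```

To use it, open the Director server log with `open(path, errors="replace")` and loop over `find_failures(...)`, printing `code` and `properties` for each result. For the error in the first post this should surface the failure code and the affected instance address without scrolling through the stack trace.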

Explorer
Posts: 8
Registered: ‎08-24-2017

Re: Cloudera Director Fails on 'Deploying Cloudera Manager Agent'

Hi,

Where can I find the server log?

Cloudera Employee
Posts: 52
Registered: ‎10-28-2014

Re: Cloudera Director Fails on 'Deploying Cloudera Manager Agent'

Try /var/log/cloudera-director-server/application.log. If it's not there, run "sudo find . -name application.log" from a suitably high-level directory.
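
If you would rather script the search, here is a rough Python equivalent of that find command. This is only a sketch: `find_logs` is a name made up for illustration, and you would still need to run it with enough privileges to read the directories involved (for example via sudo).

```python
import os
from typing import List


def find_logs(root: str, name: str = "application.log") -> List[str]:
    """Return every path under root whose file name equals name, sorted."""
    matches = []
    # os.walk silently skips directories it cannot list (onerror=None),
    # so an unprivileged run may miss logs owned by root.
    for directory, _subdirs, files in os.walk(root):
        if name in files:
            matches.append(os.path.join(directory, name))
    return sorted(matches)
```

Calling `find_logs("/var/log")` or `find_logs("/root")` would be reasonable starting points, given the two log locations mentioned in this thread.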

Cloudera Employee
Posts: 52
Registered: ‎10-28-2014

Re: Cloudera Director Fails on 'Deploying Cloudera Manager Agent'

It turns out that the agent installation log is not currently part of the diagnostic data that Director tries to download from CM. If you do not get a useful error from the Director server log, you should SSH into the CM instance and search the /tmp directory for scm_prepare_node.log. There may be more than one of these because of the retries. Those log files may contain additional information about the reason for failure.
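
Since there may be several of these files, it can help to list them in order and look at the end of each one. Here is a minimal Python sketch to run on the CM instance, assuming the files are named exactly scm_prepare_node.log and sit somewhere under /tmp. The helper names `agent_install_logs` and `tail` are hypothetical, and ordering by modification time is just one reasonable way to line the files up with the retries.

```python
import os
from collections import deque
from typing import List


def agent_install_logs(root: str = "/tmp",
                       name: str = "scm_prepare_node.log") -> List[str]:
    """Return matching log paths under root, oldest first by modification time."""
    found = []
    for directory, _subdirs, files in os.walk(root):
        if name in files:
            found.append(os.path.join(directory, name))
    return sorted(found, key=os.path.getmtime)


def tail(path: str, count: int = 20) -> List[str]:
    """Return the last count lines of a text file, without trailing newlines."""
    with open(path, errors="replace") as handle:
        # deque with maxlen keeps only the final lines while streaming the file
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

Looping over `agent_install_logs()` and printing `tail(path)` for each result shows the last lines of every attempt, which is usually where an installer reports why it gave up.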
