Created on 10-02-2017 07:02 AM - edited 10-02-2017 07:05 AM
While creating a Cloudera Manager instance using the Director UI, I get the following error:
[2017-10-02 09:56:34.626 -0400] ERROR [p-5f780de113fc-DefaultBootstrapDeploymentJob] 5c96b569-181b-4afd-82d4-f339c09a9a90 POST /api/v9/environments/GPUS-DEV/deployments com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure - c.c.l.p.DatabasePipelineRunner: Encountered an unrecoverable error ErrorInfo{code=CM_AGENT_INSTALLATION_FAIL, properties={instanceIpAddress=10.20.30.71, retryCount=5}, causes=[]} in job com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure
com.cloudera.launchpad.pipeline.UnrecoverablePipelineError: Cloudera Manager agent installation failed on instance '10.20.30.71' after 5 tries.
at com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure.run(BootstrapClouderaManagerAgent.java:350)
at com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure.run(BootstrapClouderaManagerAgent.java:294)
at com.cloudera.launchpad.pipeline.job.Job5.runUnchecked(Job5.java:34)
at com.cloudera.launchpad.pipeline.job.Job5$$FastClassBySpringCGLIB$$54178505.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:721)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
at com.cloudera.launchpad.pipeline.PipelineJobProfiler.profileJobRun(PipelineJobProfiler.java:60)
at sun.reflect.GeneratedMethodAccessor185.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:656)
at com.cloudera.launchpad.bootstrap.cluster.BootstrapClouderaManagerAgent$WaitForSuccessOrRetryOnFailure$$EnhancerBySpringCGLIB$$5678a7a3.runUnchecked(<generated>)
at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:197)
at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:168)
at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
at com.github.rholder.retry.Retryer.call(Retryer.java:160)
at com.cloudera.launchpad.pipeline.util.PipelineRunner.attemptMultipleJobExecutionsWithRetries(PipelineRunner.java:133)
at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.run(DatabasePipelineRunner.java:164)
at com.cloudera.launchpad.ExceptionHandlingRunnable.run(ExceptionHandlingRunnable.java:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
This was working fine on Friday.
Created 10-02-2017 07:44 AM
I tried using the script as well; it again gets stuck on 'Cloudera Manager to deploy agent on ...'
[ec2-user@ip-10-20-30-7 logs]$ sudo cloudera-director bootstrap /home/ec2-user/cluster.conf
Process logs can be found at /root/.cloudera-director/logs/application.log
Plugins will be loaded from /var/lib/cloudera-director-plugins
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256M; support was removed in 8.0
Cloudera Director 2.5.1 initializing ...
Installing Cloudera Manager ...
* Starting ..... done
* Requesting an instance for Cloudera Manager .......................... done
* Installing screen package (1/1) ....... done
* Running bootstrap script #1 (crc32: 52325d91) ........... done
* Waiting until 2017-10-02T10:40:20.289-04:00 for SSH access to [10.20.30.232, ip-10-20-30-232.ec2.internal, 52.70.44.135, ec2-52-70-44-135.compute-1.amazonaws.com], default port 22 ...... done
* Inspecting capabilities of 10.20.30.232 .......... done
* Normalizing 0e5eada5-888e-41f1-b49f-529a490a7511 ..... done
* Installing ntp package (1/4) ...... done
* Installing curl package (2/4) .... done
* Installing nscd package (3/4) ...... done
* Installing gdisk package (4/4) ............................ done
* Resizing instance root partition ......... done
* Mounting all instance disk drives ............. done
* Waiting for new external database servers to start running ......... done
* Installing repositories for Cloudera Manager ....... done
* Installing oracle-j2sdk1.7 package (1/4) ...... done
* Installing yum-utils package (2/4) ..... done
* Installing cloudera-manager-daemons package (3/4) ..... done
* Installing cloudera-manager-server package (4/4) ...... done
* Setting up embedded PostgreSQL database for Cloudera Manager ..... done
* Installing cloudera-manager-server-db-2 package (1/1) ..... done
* Starting embedded PostgreSQL database ....... done
* Starting Cloudera Manager server ... done
* Waiting for Cloudera Manager server to start .... done
* Changing admin Credentials for Cloudera Manager ... done
* Setting Cloudera Manager License ... done
* Enabling Enterprise Trial ... done
* Configuring Cloudera Manager .... done
* Deploying Cloudera Manager agent ....... done
* Waiting for Cloudera Manager to deploy agent on 10.20.30.232 ..................
Created 10-02-2017 09:49 AM
It seems to me that the step of installing the Cloudera Manager agent is skipped. Was there a change made to this that could have broken the process?
Created 10-03-2017 09:08 AM
Have you already checked the Director server log for more details about the failure? In addition to looking for ERROR log messages and stack traces, you can look to see if Director was able to download diagnostic data from Cloudera Manager with additional information about the reason for agent installation failure.
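If the log is large, a small script can pull out the ERROR entries for you. The following is only a minimal sketch: the regular expression is modeled on the ErrorInfo{...} text in the log excerpt posted above, and the function names parse_error_info and scan_log are made up for illustration; they are not part of Director.

```python
import re

# Pattern modeled on the ErrorInfo{code=..., properties={...}} text seen in the
# log excerpt in this thread. Other Director versions may format this differently.
ERROR_INFO = re.compile(
    r"ErrorInfo\{code=(?P<code>\w+), properties=\{(?P<props>[^}]*)\}"
)


def parse_error_info(line):
    """Return (code, properties) for an ERROR line that carries ErrorInfo, else None."""
    if " ERROR " not in line:
        return None
    match = ERROR_INFO.search(line)
    if match is None:
        return None
    props = {}
    # Properties are printed as "key=value" pairs separated by ", ".
    for pair in match.group("props").split(", "):
        if "=" in pair:
            key, value = pair.split("=", 1)
            props[key] = value
    return match.group("code"), props


def scan_log(path):
    """Yield (code, properties) for every matching ERROR line in the file at path."""
    with open(path, errors="replace") as handle:
        for line in handle:
            parsed = parse_error_info(line)
            if parsed is not None:
                yield parsed
```

Calling scan_log on the path to your application.log and printing each result should give one (code, properties) pair per failure, which makes it easier to see which instance and error code to chase.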
Created 10-03-2017 09:10 AM
Hi -
Where can I find the server log?
Created 10-03-2017 09:15 AM
Try /var/log/cloudera-director-server/application.log. If it's not there, run "sudo find . -name application.log" from some suitably high-level directory.
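If you prefer to script the search, something like the following does roughly what the find command does. It is a sketch only: find_log_files is a made-up helper name, and the candidate directories are simply the two locations mentioned in this thread, so adjust them for your install.

```python
import os


def find_log_files(roots, filename="application.log"):
    """Walk each root directory and return the full paths of files named filename."""
    matches = []
    for root in roots:
        # os.walk skips directories it cannot read (and roots that do not exist)
        # by default, so run this with enough privileges to see the log directories.
        for dirpath, _dirnames, filenames in os.walk(root):
            if filename in filenames:
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


# Candidate locations taken from this thread; yours may differ.
CANDIDATE_ROOTS = [
    "/var/log/cloudera-director-server",
    "/root/.cloudera-director/logs",
]
```

Running find_log_files(CANDIDATE_ROOTS) returns an empty list if neither location holds the file, in which case you can pass a broader root such as "/var" or "/".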
Created 10-03-2017 09:52 AM
It turns out that the agent installation log is not currently part of the diagnostic data that Director tries to download from CM. If you do not get a useful error from the Director server log, you should SSH into the CM instance and search the /tmp directory for scm_prepare_node.log. There may be more than one of these because of the retries. Those log files may contain additional information about the reason for failure.
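Since the retries can leave several of these files behind, here is a rough sketch for collecting the last few lines of each one, oldest first. The helper name tail_prepare_node_logs is made up for illustration, and I am assuming the files sit somewhere under /tmp in non-hidden directories; the exact subdirectory layout may vary.

```python
import glob
import os
from collections import deque


def tail_prepare_node_logs(tmp_dir="/tmp", name="scm_prepare_node.log", lines=20):
    """Return [(path, last_lines)] for every matching log under tmp_dir, oldest first."""
    # "**" with recursive=True matches tmp_dir itself and any nested directory,
    # but not hidden (dot-prefixed) directories.
    paths = glob.glob(os.path.join(tmp_dir, "**", name), recursive=True)
    # Order by modification time so the retries read in the order they happened.
    paths.sort(key=os.path.getmtime)
    tails = []
    for path in paths:
        with open(path, errors="replace") as handle:
            # A deque with maxlen keeps only the final lines while streaming the file.
            tails.append((path, list(deque(handle, maxlen=lines))))
    return tails
```

Run it on the CM instance (with sudo if the files are owned by root) and print each path followed by its lines. The end of the newest file is usually the first place to look for the reason the last attempt failed.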
Created 11-10-2019 10:22 AM
Hello Friends,
I am facing the same issue, "Cloudera Manager agent installation failed on instance '10.3.0.5' after 5 tries", when trying to deploy Cloudera Manager using Altus Director 6.3.
I have tried using the Cloudera CentOS 7.4 image and RHEL 7.4.
I have checked application.log and could find only this error: "Cloudera Manager agent installation failed on instance '10.3.0.5' after 5 tries".
As suggested by "jadair", I am attaching the scm_prepare_node.log from the CM host.
Please go through the scm_prepare_node.log and help me resolve this issue. Thanks in advance!