Cloudera Manager bootstrap failing on 'Installing screen package'

autodidacticon · ‎01-28-2016

I'm attempting to bootstrap Cloudera Manager and a cluster; all nodes fail when the bootstrap tries to install the 'screen' package. I have tried the following two AWS AMIs:

ami-414b7271 (RHEL 6.6, default option for the c34 template)
ami-11125e21 (RHEL 6.5)

[2016-01-28 16:13:56] ERROR [pipeline-thread-1] - c.c.l.p.DatabasePipelineRunner: Pipeline 33280be1-0db3-403e-b15f-e59c9b10a1ca suspended due to failure
com.cloudera.launchpad.common.ssh.SshException: Script execution failed with code 1. Script: sudo yum -C list installed 'screen' 2>&1 > /dev/null && echo "Package screen is already installed and upgrades are not forced.  Skipping." || sudo yum install -d 1 --assumeyes 'screen'
	at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging.run(SshJobFailFastWithOutputLogging.java:45) ~[launchpad-pipeline-common-2.0.0.jar!/:2.0.0]
	at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging.run(SshJobFailFastWithOutputLogging.java:27) ~[launchpad-pipeline-common-2.0.0.jar!/:2.0.0]
	at com.cloudera.launchpad.pipeline.job.Job3.runUnchecked(Job3.java:32) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
	at com.cloudera.launchpad.pipeline.job.Job3$$FastClassBySpringCGLIB$$54178503.invoke(<generated>) ~[spring-core-4.1.6.RELEASE.jar!/:2.0.0]
	at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) ~[spring-core-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:717) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:97) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at com.cloudera.launchpad.pipeline.PipelineJobProfiler$1.call(PipelineJobProfiler.java:67) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
	at com.codahale.metrics.Timer.time(Timer.java:101) ~[metrics-core-3.1.0.jar!/:3.1.0]
	at com.cloudera.launchpad.pipeline.PipelineJobProfiler.profileJobRun(PipelineJobProfiler.java:63) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
	at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source) ~[na:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_65]
	at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_65]
	at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:68) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:653) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
	at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging$$EnhancerBySpringCGLIB$$6f647027.runUnchecked(<generated>) ~[spring-core-4.1.6.RELEASE.jar!/:2.0.0]
	at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:159) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
	at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:130) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
	at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78) ~[guava-retrying-1.0.6.jar!/:na]
	at com.github.rholder.retry.Retryer.call(Retryer.java:110) ~[guava-retrying-1.0.6.jar!/:na]
	at com.cloudera.launchpad.pipeline.util.PipelineRunner.attemptMultipleJobExecutionsWithRetries(PipelineRunner.java:99) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
	at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.run(DatabasePipelineRunner.java:125) ~[launchpad-pipeline-database-2.0.0.jar!/:2.0.0]
	at com.cloudera.launchpad.ExceptionHandlingRunnable.run(ExceptionHandlingRunnable.java:57) [launchpad-common-2.0.0.jar!/:2.0.0]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_65]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_65]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_65]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
[2016-01-28 16:13:56] ERROR [pipeline-thread-1] - c.c.l.p.DatabasePipelineRunner: Pipeline '33280be1-0db3-403e-b15f-e59c9b10a1ca' failed
	at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging$$EnhancerBySpringCGLIB$$6f647027
	at com.cloudera.launchpad.bootstrap.InstallPackages.InstallOrUpgradePackage:1

[2016-01-28 16:13:56] INFO  [pipeline-thread-1] - c.c.l.p.s.PipelineRepositoryService: Pipeline '33280be1-0db3-403e-b15f-e59c9b10a1ca': RUNNING -> SUSPENDED
[2016-01-28 16:13:56] INFO  [pipeline-thread-1] - c.c.l.d.DeploymentRepositoryService: Deployment 'manager': BOOTSTRAPPING -> BOOTSTRAP_FAILED

jadair · ‎01-28-2016

If you ssh into the instance and try to execute those commands manually, it should be apparent what is going wrong. Could be network configuration issues trying to talk to the yum repo, etc.
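As a rough sketch of that suggestion, the two yum commands from the error message can be run one at a time so that each command's output and exit code stay visible. The `run` helper, the `reproduce_install` function, and the `DRY_RUN` switch are hypothetical names introduced here for illustration; only the yum invocations come from the log above.

```shell
#!/bin/sh
# Sketch: re-run the failing bootstrap step by hand on one of the instances.
# DRY_RUN=1 (the default in this sketch) only prints the commands; set
# DRY_RUN=0 on the instance itself to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command, and when not in dry-run mode execute it
# and report its exit code without aborting the script.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
        echo "exit code: $?"
    fi
}

# reproduce_install PACKAGE: the two steps from the failing script,
# run separately instead of being chained with && and ||.
reproduce_install() {
    pkg="$1"
    # Step 1: check the local yum cache for an installed copy.
    run sudo yum -C list installed "$pkg"
    # Step 2: the install the bootstrap attempts when step 1 fails.
    run sudo yum install -d 1 --assumeyes "$pkg"
}

reproduce_install screen
```

With `DRY_RUN=0`, any repository or network error from step 2 is printed directly instead of being buried in the Director log.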

autodidacticon · ‎01-29-2016

Our NAT instance was configured incorrectly. Switching to AWS' new NAT Gateway within the VPC wizard resolved this issue.

jadair · ‎01-29-2016

I'm glad my suggestion for how to diagnose was helpful. Thanks for commenting on what turned out to be the specific problem, in case your solution is helpful to another user in the future.

autodidacticon · ‎01-29-2016

The specific error we received:

could not contact CDS load balancer rhui2-cds01.us-west-2.aws.ce.redhat.com
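For anyone wanting to check this from an instance in the private subnet, a minimal sketch with curl follows. The function names `classify_curl_exit` and `probe`, and the `RUN_PROBE` switch, are hypothetical; the URL scheme and path are assumptions, and only the host name comes from the error above. The repository server may answer with an HTTP error or a TLS error, which still means the host was reached; a DNS failure, refused connection, or timeout points at the network path instead.

```shell
#!/bin/sh
# Sketch: probe outbound connectivity to the RHUI host named in the error.

# classify_curl_exit CODE: map a curl exit code to a short label.
# 6 = could not resolve host, 7 = failed to connect, 28 = timed out.
classify_curl_exit() {
    case "$1" in
        0)  echo "reachable" ;;
        6)  echo "dns-failure" ;;
        7)  echo "connect-failed" ;;
        28) echo "timeout" ;;
        *)  echo "other-error" ;;
    esac
}

# probe URL: send a HEAD request with short timeouts and label the result.
probe() {
    url="$1"
    curl --silent --show-error --head --output /dev/null \
         --connect-timeout 10 --max-time 20 "$url"
    classify_curl_exit "$?"
}

# Only touch the network when explicitly asked to (RUN_PROBE=1).
if [ "${RUN_PROBE:-0}" = "1" ]; then
    probe "https://rhui2-cds01.us-west-2.aws.ce.redhat.com/"
fi
```

Running it with `RUN_PROBE=1` on an affected node and seeing `timeout` or `connect-failed` would be consistent with the private subnet having no working outbound route.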

I was using the following CloudFormation template to configure our cluster:

http://docs.aws.amazon.com/quickstart/latest/cloudera/step2b.html

It provisions a NAT instance through which outbound traffic from the private subnet is routed; the private subnet is where the Cloudera instances are deployed.

What's not clear is why the original NAT instance wasn't providing outbound access to the internet.

The NAT instance was removed and replaced with a NAT gateway, which is a newer AWS product:

https://aws.amazon.com/blogs/aws/new-managed-nat-network-address-translation-gateway-for-aws/

Replacing the original NAT instance entries on the private subnet's routing table with the identifier of the NAT gateway resolved the issue.
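The same route change can be sketched with the AWS CLI rather than the console. This assumes the private subnet's route table has a default route (0.0.0.0/0) that pointed at the NAT instance; the IDs `rtb-EXAMPLE` and `nat-EXAMPLE`, the `replace_default_route` function, and the `DRY_RUN` switch are placeholders made up for illustration, not values from our setup.

```shell
#!/bin/sh
# Sketch: point the private subnet's default route at a NAT gateway.
# DRY_RUN=1 (the default in this sketch) only prints the command.
DRY_RUN="${DRY_RUN:-1}"
ROUTE_TABLE_ID="${ROUTE_TABLE_ID:-rtb-EXAMPLE}"   # hypothetical placeholder
NAT_GATEWAY_ID="${NAT_GATEWAY_ID:-nat-EXAMPLE}"   # hypothetical placeholder

# replace_default_route ROUTE_TABLE_ID NAT_GATEWAY_ID
replace_default_route() {
    # Build the command once so the printed and executed forms match.
    set -- aws ec2 replace-route \
        --route-table-id "$1" \
        --destination-cidr-block 0.0.0.0/0 \
        --nat-gateway-id "$2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

replace_default_route "$ROUTE_TABLE_ID" "$NAT_GATEWAY_ID"
```

Set `DRY_RUN=0` together with the real route table and NAT gateway IDs to apply the change.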

escapedcanadian · ‎11-08-2016

I experienced this issue when setting the parameter

associatePublicIpAddresses: false

The default seems to be 'true'.

The notes for this parameter say ...

# Whether to associate a public IP address with instances or not. If this is false
# we expect instances to be able to access the internet using a NAT instance
#
# Currently the only way to get optimal S3 data transfer performance is to assign
# public IP addresses to your instances and not use NAT (public subnet type of setup)
#
# See: http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-ip-addressing.html

So it makes sense that if you set this parameter to false, you may need to configure NAT as mentioned above.
