Created on 12-16-2018 07:47 AM - edited 09-16-2022 06:59 AM
Hello,
I am trying to install Apache Spot by following "Apache Spot in 60 minutes" (https://blog.cloudera.com/blog/2018/02/apache-spot-incubating-and-cloudera-on-aws-in-60-minutes/).
I completed the following steps successfully, but ran into errors during the installation of the cloudera-manager-daemons package.
Environment: AWS spot instance, m4.2xlarge, CentOS 7.5
cd /etc/yum.repos.d
sudo wget "http://archive.cloudera.com/director6/6.0/redhat7/cloudera-director.repo"
sudo yum --disablerepo="*" --enablerepo=cloudera-manager list available
sudo mkdir /usr/java
sudo rpm -ivh /tmp/jdk-8u191-linux-x64.rpm
sudo yum install cloudera-director-client
sudo yum install cloudera-director-server
git clone "https://github.com/hdulay/apache-spot-60-min"
cd apache-spot-60-min
sudo service cloudera-director-server start
# Make sure an IAM role is associated with the instance
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/<iam_role>
# Export the variables
export aws_region=us-west-2
export aws_subnet=xxx
export aws_security_group=xxx
export path_to_private_key=/home/centos/.ssh/id_rsa/spot.pem
export aws_owner=12345
export aws_ami=ami-00f7c900d2e7133e1
export USER=centos
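Before running the bootstrap, it can help to confirm that every exported variable actually holds a value; for example, a space after `=` in an export leaves the variable empty. The following is a minimal sketch, not part of the tutorial: the helper name `check_required_vars` is hypothetical, and the variable list is simply copied from the exports above on the assumption that aws.conf reads exactly these names.

```shell
# Variable names copied from the exports above (assumed to be what aws.conf expects).
required_vars="aws_region aws_subnet aws_security_group path_to_private_key aws_owner aws_ami"

# Hypothetical helper: returns 0 if every variable is non-empty, 1 otherwise.
check_required_vars() {
  missing=0
  for name in $required_vars; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing or empty: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_required_vars && echo "all variables set"
```

The check only tests for empty values; it does not validate that the subnet, security group, or AMI IDs exist in the chosen region.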
cloudera-director bootstrap-remote aws.conf --lp.remote.hostAndPort=127.0.0.1:7189 --lp.remote.username=admin --lp.remote.password=admin
Process logs can be found at /home/centos/.cloudera-director/logs/application.log
Plugins will be loaded from /var/lib/cloudera-director-plugins
Cloudera Altus Director 6.0.0 initializing ...
Connecting to http://localhost:7189
Current user roles: [ROLE_READONLY, ROLE_ADMIN]
Creating a new environment...
Creating external database servers if configured...
Creating a new Cloudera Manager...
Creating a new CDH cluster...
* Requesting an instance for Cloudera Manager ......... done
* Installing screen package (1/1) ... done
* Inspecting capabilities of 172.31.10.251 ... done
* Normalizing c5879270-2e9a-45ae-8167-e7293a3c6a49 ... done
* Installing ntp package (1/5) .... done
* Installing curl package (2/5) .... done
* Installing nscd package (3/5) .... done
* Installing rng-tools package (4/5) .... done
* Installing gdisk package (5/5) ........... done
* Resizing instance root partition .... done
* Mounting all instance disk drives .... done
* Running csd installation script on [172.31.10.xxx, ip-172-31-xx-xxx.us-west-2.compute.internal, xx.222.165.xx, ec2-xx-222-165-xx.us-west-2.compute.amazonaws.com] ... done
* Installing repositories for Cloudera Manager ... done
* Installing yum-utils package (1/1) ...... done
* Installing yum-utils package (1/3) ... done
* Installing cloudera-manager-daemons package (2/3) .... done
* Suspended due to failure ...
Errors from the cloudera-server logs (/var/log/cloudera-server/application.log)
INFO [Timer-0] - - - - - c.c.l.c.ssh.SshClientPoolFactory: Removing ssh client to endpoint(s): com.cloudera.launchpad.common.ssh.SshClientPoolFactory$EndpointsInfo@3611588c[[172.31.10.xxx:xx, ip-172-31-1x-xxx.us-west-2.compute.internal:22, xx.222.165.10:xx, ec2-34-222-165-10.us-west-2.compute.amazonaws.com:22],SshCredentials{username='centos', hasPassword=false, hasPrivateKey=true, hasPassphrase=false, port=22, hostKeyFingerprint=Optional.absent(), bastionHost=Optional.absent()}]
[2018-12-16 14:29:42.576 +0000] INFO [io-thread-2] - - - - - c.c.l.c.ssh.SshClientPoolFactory: Successfully closed ssh client to endpoint(s): com.cloudera.launchpad.common.ssh.SshClientPoolFactory$EndpointsInfo@4a3056d[[172.31.xx.255:22, ip-172-xx.xx-255.us-west-2.compute.internal:22, xx.27.68.xxx:22, ec2-xx-27-68-1xx.us-west-2.compute.amazonaws.com:22],SshCredentials{username='centos', hasPassword=false, hasPrivateKey=true, hasPassphrase=false, port=22, hostKeyFingerprint=Optional.absent(), bastionHost=Optional.absent()}]
[2018-12-16 14:29:42.576 +0000] INFO [io-thread-3] - - - - - c.c.l.c.ssh.SshClientPoolFactory: Successfully closed ssh client to endpoint(s): com.cloudera.launchpad.common.ssh.SshClientPoolFactory$EndpointsInfo@a535985[[172.31.x.xxx:22, ip-172-31-x-xxx.us-west-2.compute.internal:22, 18.237.xx.xxx:22, ec2-18-xxx-xx-2xx.us-west-2.compute.amazonaws.com:22],SshCredentials{username='centos', hasPassword=false, hasPrivateKey=true, hasPassphrase=false, port=22, hostKeyFingerprint=Optional.absent(), bastionHost=Optional.absent()}]
[2018-12-16 14:29:42.576 +0000] INFO [io-thread-1] - - - - - c.c.l.c.ssh.SshClientPoolFactory: Successfully closed ssh client to endpoint(s): com.cloudera.launchpad.common.ssh.SshClientPoolFactory$EndpointsInfo@707cfddd[[172.31.xx.xxx:22, ip-172-xx-1x-xxx.us-west-2.compute.internal:22, xx.221.xx.xxx:22, ec2-xx-221-xx-xx.us-west-2.compute.amazonaws.com:22],SshCredentials{username='centos', hasPassword=false, hasPrivateKey=true, hasPassphrase=false, port=22, hostKeyFingerprint=Optional.absent(), bastionHost=Optional.absent()}]
[2018-12-16 14:29:42.576 +0000] INFO [io-thread-9] - - - - - c.c.l.c.ssh.SshClientPoolFactory: Successfully closed ssh client to endpoint(s): com.cloudera.launchpad.common.ssh.SshClientPoolFactory$EndpointsInfo@28765354[[172.31.1x.xxx:22, ip-172-31-xx-xxx.us-west-2.compute.internal:22, xx.214.xxx.254:22, ec2-xx-214-xxx-254.us-west-2.compute.amazonaws.com:22],SshCredentials{username='centos', hasPassword=false, hasPrivateKey=true, hasPassphrase=false, port=22, hostKeyFingerprint=Optional.absent(), bastionHost=Optional.absent()}]
[2018-12-16 14:29:42.577 +0000] INFO [io-thread-2] - - - - - c.c.l.c.ssh.SshClientPoolFactory: Successfully closed ssh client to endpoint(s): com.cloudera.launchpad.common.ssh.SshClientPoolFactory$EndpointsInfo@6dba3d18[[172.31.xx.xxx:22, ip-172-31-xx-xxx.us-west-2.compute.internal:22, xx.32.xxx.xx:22, ec2-xx-32-xxx-xx.us-west-2.compute.amazonaws.com:22],SshCredentials{username='centos', hasPassword=false, hasPrivateKey=true, hasPassphrase=false, port=22, hostKeyFingerprint=Optional.absent(), bastionHost=Optional.absent()}]
[2018-12-16 14:29:42.577 +0000] INFO [io-thread-1] - - - - - c.c.l.c.ssh.SshClientPoolFactory: Successfully closed ssh client to endpoint(s): com.cloudera.launchpad.common.ssh.SshClientPoolFactory$EndpointsInfo@3611588c[[172.31.xx.xxx:22, ip-172-31-xx-xxx.us-west-2.compute.internal:22, xx.222.xxx.xx:22, ec2-x-222-xxx-xx.us-west-2.compute.amazonaws.com:22],SshCredentials{username='centos', hasPassword=false, hasPrivateKey=true, hasPassphrase=false, port=22, hostKeyFingerprint=Optional.absent(), bastionHost=Optional.absent()}]
[2018-12-16 14:29:42.577 +0000] INFO [io-thread-3] - - - - - c.c.l.c.ssh.SshClientPoolFactory: Successfully closed ssh client to endpoint(s): com.cloudera.launchpad.common.ssh.SshClientPoolFactory$EndpointsInfo@3611588c[[172.31.10.251:22, ip-172-31-xx-xxx.us-west-2.compute.internal:22, xx.222.1xx.xx:22, ec2-xx-222-1xx-xx.us-west-2.compute.amazonaws.com:22],SshCredentials{username='centos', hasPassword=false, hasPrivateKey=true, hasPassphrase=false, port=22, hostKeyFingerprint=Optional.absent(), bastionHost=Optional.absent()}]
[2018-12-16 14:29:43.522 +0000] INFO [task-thread-5] - - - - - c.c.l.task.RefreshDeployments: Refreshing pre-existing Deployments
[2018-12-16 14:29:43.522 +0000] INFO [task-thread-5] - - - - - c.c.l.task.RefreshDeployments: Skipping refresh of deployment apache-spot Environment:apache-spot Deployment as it is in transition or terminated. (Stage: BOOTSTRAP_FAILED, deploymentIsNull: true)
[2018-12-16 14:29:43.522 +0000] INFO [task-thread-5] - - - - - c.c.l.task.RefreshDeployments: Finished refreshing all pre-existing Deployment models
[2018-12-16 14:29:43.527 +0000] INFO [task-thread-8] - - - - - c.c.launchpad.task.RefreshClusters: Refreshing Cluster models
[2018-12-16 14:29:43.534 +0000] INFO [task-thread-8] - - - - - c.c.launchpad.task.RefreshClusters: Skipping refresh of cluster apache-spot Environment:apache-spot Deployment:apache-spot as it is in transition, terminated, or has failed updating. (Stage: BOOTSTRAP_FAILED, clusterIsNull: true)
Can anyone please help?
-Jyo
Created 12-17-2018 06:53 AM
Hi Jyo. This section of the log doesn't appear to contain the information we need. Could you search your log for the string BOOTSTRAP_FAILED and check whether there is an error above it that is more informative? Look especially around the installation of cloudera-manager-daemons, since that seems to be where it failed. It could be a networking issue.
Does this happen consistently when trying to bootstrap a cluster?
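One way to do that search is sketched below, assuming GNU grep is available. The helper name `show_failure_context` is hypothetical, and the search pattern is only a guess at strings worth looking for; pass whichever log file you are inspecting (the bootstrap output mentions /home/centos/.cloudera-director/logs/application.log).

```shell
# Hypothetical helper: print matching log lines with some preceding context.
# Usage: show_failure_context /path/to/application.log [lines_of_context]
show_failure_context() {
  log_file="$1"
  context="${2:-40}"   # number of lines to show before each match
  # -n prefixes line numbers, -B prints context before each match.
  grep -n -B "$context" -E 'BOOTSTRAP_FAILED| ERROR |scriptlet failed' "$log_file"
}
```

Matching lines are printed as `N:text` and context lines as `N-text`, so the first error can be located in the full log by its line number.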
Created 12-17-2018 04:22 PM
Created 12-18-2018 06:44 AM
Please find below the output from another install, this time on RHEL 7.6:
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-Scalar-List-Utils.x86_64 0:1.27-248.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-Socket.x86_64 0:2.010-4.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-Storable.x86_64 0:2.45-3.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-Text-ParseWords.noarch 0:3.29-4.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-Time-HiRes.x86_64 4:1.9725-3.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-Time-Local.noarch 0:1.2300-2.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-constant.noarch 0:1.27-2.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-libs.x86_64 4:5.16.3-293.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-macros.x86_64 4:5.16.3-293.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-parent.noarch 1:0.225-244.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-podlators.noarch 0:2.5.1-3.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-threads.x86_64 0:1.87-4.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: perl-threads-shared.x86_64 0:1.43-6.el7
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: Failed:
[2018-12-18 14:31:45.288 +0000] INFO [io-thread-3] - - - - - ssh:172.31.25.238: cloudera-manager-daemons.x86_64 0:6.0.1-610811.el7
[2018-12-18 14:31:45.293 +0000] ERROR [p-3b331f667d4b-DefaultBootstrapDeploymentJob] 1fefa489-5295-4fd0-9c70-5777be5ff07d POST /api/d6.0/environments/apache-spot%20Environment/deployments com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging - c.c.l.pipeline.util.PipelineRunner: Attempt to execute job failed
com.cloudera.launchpad.common.ssh.SshException: Script execution failed with code 1. Script: sudo yum -C list installed 'cloudera-manager-daemons' 2>&1 > /dev/null && echo "Package cloudera-manager-daemons is already installed and upgrades are not forced. Skipping." || sudo yum install -d 1 --assumeyes 'cloudera-manager-daemons'
at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging.run(SshJobFailFastWithOutputLogging.java:46)
at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging.run(SshJobFailFastWithOutputLogging.java:28)
at com.cloudera.launchpad.pipeline.job.Job3.runUnchecked(Job3.java:32)
at com.cloudera.launchpad.pipeline.job.Job3$$FastClassBySpringCGLIB$$54178503.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:88)
at com.cloudera.launchpad.pipeline.PipelineJobProfiler.profileJobRun(PipelineJobProfiler.java:60)
at sun.reflect.GeneratedMethodAccessor152.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:644)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:633)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)
at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging$$EnhancerBySpringCGLIB$$2050bd1f.runUnchecked(<generated>)
at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:202)
at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:173)
at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
at com.github.rholder.retry.Retryer.call(Retryer.java:160)
at com.cloudera.launchpad.pipeline.util.PipelineRunner.attemptMultipleJobExecutionsWithRetries(PipelineRunner.java:136)
at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.doRun(DatabasePipelineRunner.java:214)
at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.run(DatabasePipelineRunner.java:154)
at com.cloudera.launchpad.ExceptionHandlingRunnable.run(ExceptionHandlingRunnable.java:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2018-12-18 14:31:55.804 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Error: No matching Packages to list
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: ================================================================================
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Package Arch Version Repository Size
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: ================================================================================
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Installing:
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: cloudera-manager-daemons x86_64 6.0.1-610811.el7 cloudera-manager 1.0 G
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Transaction Summary
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: ================================================================================
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Install 1 Package
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Transaction Summary
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: ================================================================================
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Install 1 Package
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Total download size: 1.0 G
[2018-12-18 14:31:56.906 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Installed size: 1.2 G
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: +======================================================================+
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | Error: Unable to find a compatible version of Java on this host,|
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | either because JAVA_HOME has not been set or because a |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | compatible version of Java is not installed. |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: +----------------------------------------------------------------------+
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | Please download a supported version of the Oracle JDK from the |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | Oracle Java web site: |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | > http://www.oracle.com/technetwork/java/javase/index.html < |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | Cloudera Manager requires Oracle JDK 1.8 or later. |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | NOTE: Cloudera Manager will find the Oracle JDK when starting, |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | regardless of whether you installed the JDK using a binary |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: | installer or the RPM-based installer. |
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: +======================================================================+
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: error: %pre(cloudera-manager-daemons-6.0.1-610811.el7.x86_64) scriptlet failed, exit status 1
[2018-12-18 14:32:17.536 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Error in PREIN scriptlet in rpm package cloudera-manager-daemons-6.0.1-610811.el7.x86_64
[2018-12-18 14:32:17.708 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: Failed:
[2018-12-18 14:32:17.708 +0000] INFO [io-thread-9] - - - - - ssh:172.31.25.238: cloudera-manager-daemons.x86_64 0:6.0.1-610811.el7
[2018-12-18 14:32:17.709 +0000] ERROR [p-3b331f667d4b-DefaultBootstrapDeploymentJob] 1fefa489-5295-4fd0-9c70-5777be5ff07d POST /api/d6.0/environments/apache-spot%20Environment/deployments com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging - c.c.l.pipeline.util.PipelineRunner: Attempt to execute job failed
com.cloudera.launchpad.common.ssh.SshException: Script execution failed with code 1. Script: sudo yum -C list installed 'cloudera-manager-daemons' 2>&1 > /dev/null && echo "Package cloudera-manager-daemons is already installed and upgrades are not forced. Skipping." || sudo yum install -d 1 --assumeyes 'cloudera-manager-daemons'
at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging.run(SshJobFailFastWithOutputLogging.java:46)
at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging.run(SshJobFailFastWithOutputLogging.java:28)
at com.cloudera.launchpad.pipeline.job.Job3.runUncheck
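The log above shows the %pre scriptlet of cloudera-manager-daemons failing because it could not find a compatible Java on 172.31.25.238. The following is a minimal sketch, assuming you can SSH to that instance, of reporting which Java (if any) is visible there. The helper names `java_major` and `report_java` are hypothetical, and this does not reproduce the check the RPM scriptlet itself performs.

```shell
# Hypothetical helper: derive the major version from a Java version string.
# "1.8.0_191" uses the legacy 1.x scheme; "11.0.2" uses the newer scheme.
java_major() {
  case "$1" in
    1.*) v="${1#1.}"; echo "${v%%.*}" ;;
    *)   echo "${1%%.*}" ;;
  esac
}

# Hypothetical helper: report the java on PATH and the current JAVA_HOME.
report_java() {
  if ! command -v java >/dev/null 2>&1; then
    echo "no java on PATH; JAVA_HOME=${JAVA_HOME:-<unset>}"
    return 1
  fi
  # java -version writes to stderr; the version is the first quoted string.
  version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  echo "java $version (major $(java_major "$version")); JAVA_HOME=${JAVA_HOME:-<unset>}"
}

# Usage, run on the instance where the package install failed: report_java
```

The error text says Cloudera Manager requires Oracle JDK 1.8 or later, so a reported major version below 8, or no Java at all, would be consistent with the scriptlet failure.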
Created 12-19-2018 06:52 AM