Created on 10-23-2015 06:06 PM - edited 09-16-2022 02:45 AM
Hi everyone,
I successfully set up a Virtual Private Cloud on AWS and started an instance using the Cloudera Director AMI (ID ami-2957655e, created on October 16).
I logged into its Web panel, added an environment, and then attempted to add Cloudera Manager. I did not attempt to go as far as adding a cluster too, because I keep getting a bootstrap_failed error. This has happened a couple of times already, with the same error.
(OS: the template for the Manager and cluster instances uses an Amazon-provisioned AMI running CentOS 6.5. The template uses t2.micro instances. Very small, I know, but this is exclusively to learn the workflow and how to use the interface. Once I see everything running smoothly, I will switch to larger sizes. What follows doesn't seem to be a memory issue anyway.)
I checked the /var/log/cloudera-director-server/application.log and I can see that the Director can SSH into the other instance, but then it throws this exception, something to do with Google's Guava library:
[2015-10-24 00:17:19] ERROR [pipeline-thread-1] - c.c.l.p.DatabasePipelineRunner: Pipeline 1efc9b36-50f9-4eca-874d-6a736779de59 suspended due to failure
java.lang.IllegalStateException: Optional.get() cannot be called on an absent value
    at com.google.common.base.Absent.get(Absent.java:42) ~[guava-15.0.jar!/:na]
    at com.cloudera.launchpad.inspector.LogInstalledPackagesAndRepositories.run(LogInstalledPackagesAndRepositories.java:37) ~[launchpad-inspector-1.5.0.jar!/:1.5.0]
    at com.cloudera.launchpad.inspector.LogInstalledPackagesAndRepositories.run(LogInstalledPackagesAndRepositories.java:23) ~[launchpad-inspector-1.5.0.jar!/:1.5.0]
    at com.cloudera.launchpad.pipeline.job.Job2.runUnchecked(Job2.java:31) ~[launchpad-pipeline-1.5.0.jar!/:1.5.0]
    at com.cloudera.launchpad.pipeline.job.Job2$$FastClassBySpringCGLIB$$54178502.invoke(<generated>) ~[spring-core-4.1.5.RELEASE.jar!/:1.5.0]
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) ~[spring-core-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:717) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:97) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at com.cloudera.launchpad.pipeline.PipelineJobProfiler$1.call(PipelineJobProfiler.java:55) ~[launchpad-pipeline-1.5.0.jar!/:1.5.0]
    at com.codahale.metrics.Timer.time(Timer.java:101) ~[metrics-core-3.1.0.jar!/:3.1.0]
    at com.cloudera.launchpad.pipeline.PipelineJobProfiler.profileJobRun(PipelineJobProfiler.java:51) ~[launchpad-pipeline-1.5.0.jar!/:1.5.0]
    at sun.reflect.GeneratedMethodAccessor125.invoke(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_67]
    at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_67]
    at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:68) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:653) ~[spring-aop-4.1.5.RELEASE.jar!/:4.1.5.RELEASE]
    at com.cloudera.launchpad.inspector.LogInstalledPackagesAndRepositories$$EnhancerBySpringCGLIB$$5826f59a.runUnchecked(<generated>) ~[spring-core-4.1.5.RELEASE.jar!/:1.5.0]
    at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:165) ~[launchpad-pipeline-1.5.0.jar!/:1.5.0]
    at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:136) ~[launchpad-pipeline-1.5.0.jar!/:1.5.0]
    at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78) ~[guava-retrying-1.0.6.jar!/:na]
    at com.github.rholder.retry.Retryer.call(Retryer.java:110) ~[guava-retrying-1.0.6.jar!/:na]
    at com.cloudera.launchpad.pipeline.util.PipelineRunner.attemptMultipleJobExecutionsWithRetries(PipelineRunner.java:98) ~[launchpad-pipeline-1.5.0.jar!/:1.5.0]
    at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.run(DatabasePipelineRunner.java:120) ~[launchpad-pipeline-database-1.5.0.jar!/:1.5.0]
    at com.cloudera.launchpad.ExceptionHandlingRunnable.run(ExceptionHandlingRunnable.java:57) [launchpad-common-1.5.0.jar!/:1.5.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_67]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_67]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_67]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_67]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]
[2015-10-24 00:17:19] INFO [pipeline-thread-1] - c.c.l.p.s.PipelineRepositoryService: Pipeline '1efc9b36-50f9-4eca-874d-6a736779de59': RUNNING -> SUSPENDED
[2015-10-24 00:17:19] INFO [pipeline-thread-1] - c.c.l.d.DeploymentRepositoryService: Deployment 'Geomesa shepherd': BOOTSTRAP_FAILED -> BOOTSTRAP_FAILED
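For anyone triaging a similar application.log, here is a minimal Python sketch that separates log entries and pulls the exception message apart from its stack frames. It is not part of Cloudera Director. The function name `parse_entries` and the regular expressions are my own, and the entry layout is assumed only from the excerpt above (`[timestamp] LEVEL [thread] - logger: message`), so other Director versions may format things differently.

```python
import re

# Header of one log entry, as seen in the excerpt:
# [2015-10-24 00:17:19] ERROR [pipeline-thread-1] - c.c.l.p.DatabasePipelineRunner: ...
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<level>[A-Z]+) \[(?P<thread>[^\]]+)\] - (?P<logger>[^:\s]+): "
)
# A stack frame starts with " at " followed by a qualified method name and "(".
FRAME_SPLIT = re.compile(r" at (?=[\w.$]+\()")


def parse_entries(text):
    """Split raw log text into entries with a message and a list of frames."""
    text = " ".join(text.split())  # collapse line breaks and repeated spaces
    starts = [m.start() for m in ENTRY.finditer(text)]
    entries = []
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        chunk = text[begin:end].strip()
        header = ENTRY.match(chunk)
        message, *frames = FRAME_SPLIT.split(chunk[header.end():])
        entries.append({
            "timestamp": header["ts"],
            "level": header["level"],
            "logger": header["logger"],
            "message": message,
            "frames": frames,
        })
    return entries


# Shortened sample taken from the log above (only the first two frames kept).
SAMPLE = (
    "[2015-10-24 00:17:19] ERROR [pipeline-thread-1] - c.c.l.p.DatabasePipelineRunner: "
    "Pipeline 1efc9b36-50f9-4eca-874d-6a736779de59 suspended due to failure "
    "java.lang.IllegalStateException: Optional.get() cannot be called on an absent value "
    "at com.google.common.base.Absent.get(Absent.java:42) ~[guava-15.0.jar!/:na] "
    "at com.cloudera.launchpad.inspector.LogInstalledPackagesAndRepositories.run("
    "LogInstalledPackagesAndRepositories.java:37) ~[launchpad-inspector-1.5.0.jar!/:1.5.0] "
    "[2015-10-24 00:17:19] INFO [pipeline-thread-1] - c.c.l.p.s.PipelineRepositoryService: "
    "Pipeline '1efc9b36-50f9-4eca-874d-6a736779de59': RUNNING -> SUSPENDED"
)

if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        if entry["level"] == "ERROR":
            print(entry["message"])
            # The first frame in a com.cloudera package is usually the useful one.
            for frame in entry["frames"]:
                if frame.startswith("com.cloudera."):
                    print("  first Cloudera frame:", frame)
                    break
```

On this log, the first Cloudera frame is LogInstalledPackagesAndRepositories, which suggests the failure happens while Director inspects the packages and repositories on the new instance rather than during the SSH connection itself.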
Any help is much appreciated.
If it is difficult to gauge what the problem is, maybe someone can suggest a tried and tested AMI that I could use for the template.
Thanks!
Created on 10-23-2015 06:17 PM - edited 10-23-2015 06:17 PM
With t2.micro the process of configuring Cloudera Manager will not be able to succeed. That instance type simply doesn't have enough memory for the main server process and all the management services. Our recommendation is to use m4.large or m4.xlarge. Also for Director itself you should use an instance like c3.large for best performance.
Regarding the operating system, our recommendation is to use the official releases, available either as community AMIs or from the AWS Marketplace:
https://aws.amazon.com/marketplace/seller-profile?id=16cb8b03-256e-4dde-8f34-1b0f377efe89 (for CentOS)
This documentation page contains some more instructions on how to find an AMI:
Also see Requirements and Supported Versions for additional information:
Created on 10-23-2015 06:36 PM - edited 10-23-2015 06:36 PM
In eu-west-1 (Ireland) you could try the official CentOS 6.5 Marketplace AMI with ID: ami-42718735
https://aws.amazon.com/marketplace/pp/B00IOYDTV6 (you can find it by clicking on Continue and on the Manual Launch tab)
Another option is to use the RHEL 6.6 HVM community AMI with ID: ami-cf3b47b8
I found it by searching the Community AMIs page for "rhel-6.6 hvm 2015".
It's good that you are using AmazonProvidedDNS, because that's the easiest option to start with.
Created on 10-23-2015 07:03 PM - edited 10-23-2015 07:04 PM
The plan is to run Accumulo, and on top of it, a geospatial extension called Geomesa (http://www.geomesa.org/).
I am at the "highly experimental" phase. I see that Cloudera has documentation on how to install Accumulo via Cloudera Manager, but then I will have to see how I can add the Geomesa iterator to each worker - I guess I will have to come up with a bootstrap script or something.
Edit:
Forgot to add: it is not a long-running cluster for now, but if the experiment succeeds, there will very likely be a long-running one in the future.
Created 10-23-2015 07:10 PM
Exactly right.
Installing custom software can be automated via a bootstrap script, but for Geomesa I think you will need some automation that runs after the cluster has been configured via Director. That is the case if Geomesa needs the Accumulo service up and running before it can start. To get Accumulo installed via Cloudera Director, you will need to switch from using the UI to using a client configuration file. That allows for more control over your choice of parcel repositories and service types.
You can find example configuration files here: https://github.com/cloudera/director-scripts/tree/master/configs
Created 10-24-2015 08:38 AM
Thanks for the advice and for the link.
Geomesa needs Accumulo to be installed already (because it needs to put some .jar files in its lib and lib/ext folders).
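The copy step could be scripted along these lines once Accumulo is in place. This is only a rough Python sketch: the function name `install_jars` is mine, the directories in the usage note are hypothetical placeholders, and which jars belong in lib versus lib/ext depends on the Geomesa documentation for the version in use.

```python
import shutil
from pathlib import Path


def install_jars(source_dir, accumulo_home, subdir="lib/ext"):
    """Copy every .jar from source_dir into <accumulo_home>/<subdir>.

    Returns the names of the copied files, sorted alphabetically.
    """
    target = Path(accumulo_home) / subdir
    target.mkdir(parents=True, exist_ok=True)  # create lib/ext if it is missing
    copied = []
    for jar in sorted(Path(source_dir).glob("*.jar")):
        shutil.copy2(jar, target / jar.name)  # copy2 also keeps timestamps
        copied.append(jar.name)
    return copied
```

It would be called on each worker as something like `install_jars("/tmp/geomesa-jars", "/path/to/accumulo")`, where both paths are placeholders to be replaced with the real locations on the instance.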
I will very likely need to move to using config files.