Member since 01-28-2016
9 Posts
1 Kudos Received
0 Solutions
02-04-2016
08:04 PM
Thanks; our incoming CIDRs were incorrect.
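For anyone checking the same thing, here is a minimal sketch of testing whether a source address is covered by an allowed CIDR block, using Python's standard `ipaddress` module. The address and CIDR values are illustrative assumptions; the post does not say which rules were wrong.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any of the allowed CIDR blocks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


# Hypothetical values for illustration only.
print(is_allowed("172.31.47.109", ["172.31.0.0/16"]))  # True
print(is_allowed("172.31.47.109", ["10.0.0.0/8"]))     # False
```

Running a check like this against the inbound rules of the security group is a quick way to spot a range that does not cover the host doing the bootstrap.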
02-03-2016
02:48 PM
Attempting to provision a new environment containing a cluster of 3 Kafka and ZooKeeper nodes using the cloudera-director bootstrap-remote CLI utility. The bootstrap process fails with the following stack trace: http://pastebin.com/Qyg9bL96
However, when a new cluster is created from the Director UI, the process succeeds. Here is our configuration file:
name: Atlas-Test
provider {
    type: aws
    publishAccessKeys: true
    region: us-west-2
    subnetId: subnet-daf4c9ad
    securityGroupsIds: sg-feae6f99
    instanceNamePrefix: test
    associatePublicIpAddresses: false
}
ssh {
    username: ec2-user
    privateKey: /home/ec2-user/atlas.pem
}
instances {
    default {
        type: m4.large
        image: ami-414b7271
        tags {
            owner: ${?USER}
        }
    }
}
cloudera-manager {
    instance: ${instances.default} {
        tags {
            application: "Cloudera Manager 5"
        }
    }
    enableEnterpriseTrial: true
}
cluster {
    products {
        CDH: 5
    }
    services: [ZOOKEEPER, KAFKA]
    master {
        count: 3
        instance: ${instances.default} {
            tags {
                group: master
            }
        }
        roles {
            ZOOKEEPER: [SERVER]
        }
    }
    gateway {
        count: 3
        instance: ${instances.default} {
            tags {
                group: kafka
            }
        }
        roles {
            KAFKA: [KAFKA_BROKER]
        }
    }
}
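As a reading aid for the configuration above, here is a rough Python sketch of how the two HOCON features it relies on behave: `${?USER}` is an optional substitution (the `owner` key is simply left unset if `USER` is not defined), and `${instances.default} { ... }` merges the extra block into the referenced object. The helper names `deep_merge` and `default_instance` are hypothetical and exist only for this illustration; this is not how Director itself parses the file.

```python
import os


def deep_merge(base, override):
    """Recursively merge override into a copy of base, as HOCON object concatenation does."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def default_instance(env=None):
    """Emulate instances.default; `owner` is omitted when USER is undefined."""
    env = os.environ if env is None else env
    tags = {}
    if env.get("USER") is not None:
        tags["owner"] = env["USER"]
    return {"type": "m4.large", "image": "ami-414b7271", "tags": tags}


# Emulates: instance: ${instances.default} { tags { group: master } }
master = deep_merge(default_instance({"USER": "ec2-user"}), {"tags": {"group": "master"}})
print(master)
# {'type': 'm4.large', 'image': 'ami-414b7271', 'tags': {'owner': 'ec2-user', 'group': 'master'}}
```

One consequence worth noting: if the CLI runs in an environment where `USER` is unset, the `owner` tag silently disappears rather than causing an error.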
02-01-2016
10:26 AM
Here is the tail of the log from one of the application nodes:
[01/Feb/2016 12:11:36 +0000] 3034 MainThread agent ERROR Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 1635, in find_or_start_supervisor
    self.configure_supervisor_clients()
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 1882, in configure_supervisor_clients
    supervisor_options.realize(args=["-c", os.path.join(self.supervisor_dir, "supervisord.conf")])
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 1564, in realize
    Options.realize(self, *arg, **kw)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 311, in realize
    self.process_config()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 319, in process_config
    self.process_config_file(do_usage)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 354, in process_config_file
    self.usage(str(msg))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 142, in usage
    self.exit(2)
SystemExit: 2
[01/Feb/2016 12:11:36 +0000] 3034 MainThread tmpfs INFO Successfully mounted tmpfs at /var/run/cloudera-scm-agent/process
[01/Feb/2016 12:11:39 +0000] 3034 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 1)
[01/Feb/2016 12:11:40 +0000] 3034 MainThread agent INFO Supervisor version: 3.0
[01/Feb/2016 12:11:40 +0000] 3034 MainThread agent INFO Successfully connected to supervisor
[01/Feb/2016 12:11:40 +0000] 3034 MainThread status_server INFO Using maximum impala profile bundle size of 1073741824 bytes.
[01/Feb/2016 12:11:40 +0000] 3034 MainThread status_server INFO Using maximum stacks log bundle size of 1073741824 bytes.
[01/Feb/2016 12:11:40 +0000] 3034 MainThread _cplogging INFO [01/Feb/2016:12:11:40] ENGINE Bus STARTING
[01/Feb/2016 12:11:40 +0000] 3034 MainThread _cplogging INFO [01/Feb/2016:12:11:40] ENGINE Started monitor thread '_TimeoutMonitor'.
[01/Feb/2016 12:11:41 +0000] 3034 MainThread _cplogging INFO [01/Feb/2016:12:11:41] ENGINE Serving on ip-172-31-39-199.us-west-2.compute.internal:9000
[01/Feb/2016 12:11:41 +0000] 3034 MainThread _cplogging INFO [01/Feb/2016:12:11:41] ENGINE Bus STARTED
[01/Feb/2016 12:11:41 +0000] 3034 MainThread __init__ INFO New monitor: (<cmf.monitor.host.HostMonitor object at 0x34ab1d0>,)
[01/Feb/2016 12:11:41 +0000] 3034 MonitorDaemon-Scheduler __init__ INFO Monitor ready to report: ('HostMonitor',)
[01/Feb/2016 12:11:41 +0000] 3034 MainThread agent INFO Setting default socket timeout to 30
[01/Feb/2016 12:11:41 +0000] 3034 MainThread agent INFO Using parcels directory from server provided value: /opt/cloudera/parcels
[01/Feb/2016 12:11:41 +0000] 3034 MainThread agent INFO Created /opt/cloudera/parcels
[01/Feb/2016 12:11:41 +0000] 3034 MainThread agent INFO Chowning /opt/cloudera/parcels to root (0) root (0)
[01/Feb/2016 12:11:41 +0000] 3034 MainThread agent INFO Chmod'ing /opt/cloudera/parcels to 0755
[01/Feb/2016 12:11:41 +0000] 3034 MainThread agent INFO Created /opt/cloudera/parcel-cache
[01/Feb/2016 12:11:41 +0000] 3034 MainThread agent INFO Chowning /opt/cloudera/parcel-cache to root (0) root (0)
[01/Feb/2016 12:11:41 +0000] 3034 MainThread agent INFO Chmod'ing /opt/cloudera/parcel-cache to 0755
[01/Feb/2016 12:11:41 +0000] 3034 MainThread parcel INFO Agent does create users/groups and apply file permissions
[01/Feb/2016 12:11:41 +0000] 3034 MainThread downloader INFO Downloader path: /opt/cloudera/parcel-cache
[01/Feb/2016 12:11:41 +0000] 3034 MainThread parcel_cache INFO Using /opt/cloudera/parcel-cache for parcel cache
[01/Feb/2016 12:11:42 +0000] 3034 MainThread firehoses INFO Reporting interval updated: 5.0 -> 60
[01/Feb/2016 12:11:42 +0000] 3034 MainThread agent INFO Active parcel list updated; recalculating component info.
[01/Feb/2016 12:11:56 +0000] 3034 CP Server Thread-4 _cplogging INFO 172.31.47.109 - - [01/Feb/2016:12:11:56] "GET /heartbeat HTTP/1.1" 200 2 "" "NING/1.0"
[01/Feb/2016 12:12:11 +0000] 3034 DnsResolutionMonitor throttling_logger INFO Using java location: '/usr/java/jdk1.7.0_67-cloudera/bin/java'.
[01/Feb/2016 12:12:27 +0000] 3034 CP Server Thread-5 _cplogging INFO 172.31.47.109 - - [01/Feb/2016:12:12:27] "GET /heartbeat HTTP/1.1" 200 2 "" "NING/1.0"
[01/Feb/2016 12:12:27 +0000] 3034 Thread-13 downloader INFO Starting download of: http://ip-172-31-47-109.us-west-2.compute.internal:7180/cmf/parcel/download/KAFKA-0.8.2.0-1.kafka1.4.0.p0.56-el6.parcel
[01/Feb/2016 12:12:28 +0000] 3034 Thread-13 downloader INFO Completed download of http://ip-172-31-47-109.us-west-2.compute.internal:7180/cmf/parcel/download/KAFKA-0.8.2.0-1.kafka1.4.0.p0.56-el6.parcel code=200 state=downloaded
[01/Feb/2016 12:12:28 +0000] 3034 Thread-13 parcel_cache INFO Checking checksum of parcel KAFKA-0.8.2.0-1.kafka1.4.0.p0.56-el6.parcel...
[01/Feb/2016 12:12:28 +0000] 3034 Thread-13 parcel_cache INFO Unpacking /opt/cloudera/parcel-cache/KAFKA-0.8.2.0-1.kafka1.4.0.p0.56-el6.parcel into /opt/cloudera/parcels
[01/Feb/2016 12:12:29 +0000] 3034 Thread-13 parcel_cache INFO Unpack of parcel /opt/cloudera/parcel-cache/KAFKA-0.8.2.0-1.kafka1.4.0.p0.56-el6.parcel successful
[01/Feb/2016 12:12:29 +0000] 3034 Thread-13 downloader INFO Finished download [ url: http://ip-172-31-47-109.us-west-2.compute.internal:7180/cmf/parcel/download/KAFKA-0.8.2.0-1.kafka1.4.0.p0.56-el6.parcel, state: complete, total_bytes: 37155105, downloaded_bytes: 37155105, start_time: 2016-02-01 12:12:27, download_end_time: 2016-02-01 12:12:28, end_time: 2016-02-01 12:12:29, code: 200, exception_msg: None, path: /opt/cloudera/parcel-cache/KAFKA-0.8.2.0-1.kafka1.4.0.p0.56-el6.parcel ]
[01/Feb/2016 12:12:42 +0000] 3034 MonitorDaemon-Reporter firehoses INFO Creating a connection to the SERVICEMONITOR.
[01/Feb/2016 12:12:42 +0000] 3034 MonitorDaemon-Reporter firehoses INFO Creating a connection to the HOSTMONITOR.
[01/Feb/2016 12:12:42 +0000] 3034 MainThread parcel INFO Loading parcel manifest for: KAFKA-0.8.2.0-1.kafka1.4.0.p0.56
[01/Feb/2016 12:12:43 +0000] 3034 MainThread parcel INFO Ensuring users/groups exist for new parcel KAFKA-0.8.2.0-1.kafka1.4.0.p0.56.
[01/Feb/2016 12:12:43 +0000] 3034 MainThread parcel INFO Executing command ['/usr/sbin/groupadd', '-r', 'kafka']
[01/Feb/2016 12:12:47 +0000] 3034 MainThread parcel INFO Executing command ['/usr/sbin/groupadd', '-r', 'kafka']
[01/Feb/2016 12:12:47 +0000] 3034 MainThread parcel INFO Executing command ['/usr/sbin/useradd', '-r', '-m', '-g', 'kafka', '-K', 'UMASK=022', '--home', '/var/lib/kafka', '--comment', 'Kafka', '--shell', '/sbin/nologin', 'kafka']
[01/Feb/2016 12:12:48 +0000] 3034 MainThread parcel INFO Ensuring correct file permissions for new parcel KAFKA-0.8.2.0-1.kafka1.4.0.p0.56.
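To pull only the error entries out of a tail like this, something along these lines may help. It is a minimal sketch that assumes every entry follows the `[timestamp] pid thread module LEVEL message` shape seen above and that the level is one of the standard Python logging names; continuation lines such as the traceback body are skipped.

```python
import re

# Thread names may contain spaces (e.g. "CP Server Thread-4"), so the thread
# group is matched lazily up to the module name that precedes the level.
LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<pid>\d+)\s+(?P<thread>.+?)\s+"
    r"(?P<module>\S+)\s+(?P<level>DEBUG|INFO|WARNING|ERROR|CRITICAL)\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one agent log entry, or None for non-entry lines."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def errors_only(lines):
    """Keep only the parsed entries whose level is ERROR."""
    return [rec for rec in map(parse_line, lines) if rec and rec["level"] == "ERROR"]


sample = [
    "[01/Feb/2016 12:11:36 +0000] 3034 MainThread agent ERROR Failed to connect to previous supervisor.",
    "Traceback (most recent call last):",
    "[01/Feb/2016 12:11:40 +0000] 3034 MainThread agent INFO Successfully connected to supervisor",
]
for rec in errors_only(sample):
    print(rec["timestamp"], rec["message"])
```

Applied to the tail above, the only ERROR entry is the supervisor reconnect at 12:11:36, which is followed four seconds later by a successful connection; the parcel steps after that are all logged at INFO.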
02-01-2016
09:33 AM
Installing Kafka and Flume on two m4.2 instances, and the bootstrap seems to have hung here:
[2016-02-01 12:30:47] INFO [pipeline-thread-42] - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (CDH, 5.5.1-1.cdh5.5.1.p0.11) stage DISTRIBUTED. Current: DOWNLOADED Past: [DOWNLOADED, DOWNLOADED, DOWNLOADED]. State ApiParcelState{progress=100, progressTotal=100, count=1, countTotal=1, warnings=null, errors=null}
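The log line comes from an unbounded wait, so a stuck parcel shows up as a hang rather than a failure. When scripting around this, a bounded poll makes the stall visible. The sketch below is only an illustration of that idea: `get_stage` is a hypothetical callable standing in for however the current parcel stage is fetched, and nothing here is Director's or Cloudera Manager's actual API.

```python
import time


def wait_for_stage(get_stage, target, timeout_s=600.0, interval_s=5.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_stage() until it returns target, or raise after timeout_s seconds."""
    deadline = clock() + timeout_s
    history = []
    while True:
        stage = get_stage()
        history.append(stage)
        if stage == target:
            return history
        if clock() >= deadline:
            raise TimeoutError(
                f"stage stuck at {stage!r}, wanted {target!r}; recent: {history[-3:]}"
            )
        sleep(interval_s)


# Simulated stage sequence using only the stage names from the log line above.
stages = iter(["DOWNLOADED", "DOWNLOADED", "DISTRIBUTED"])
print(wait_for_stage(lambda: next(stages), "DISTRIBUTED", interval_s=0))
# ['DOWNLOADED', 'DOWNLOADED', 'DISTRIBUTED']
```

The timeout error carries the last few observed stages, which mirrors the `Past: [...]` list in the Director log and makes it clear where the parcel stopped progressing.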
01-31-2016
02:51 PM
I meant to ask: are there any plans for Amazon Linux support in the future?
01-29-2016
01:48 PM
I've attempted to provision Cloudera Manager with the AMIs from the following: https://aws.amazon.com/amazon-linux-ami/
I receive the following errors:
Only certain Linux platforms are supported. See the "Supported Distributions and Resource Requirements" section of the Cloudera Director User Guide.
Invalid owner Id for AMI ami-f0091d91: 137112412989 (Amazon Linux)
Labels: Cloudera Manager
01-29-2016
12:36 PM
The specific error we received:
could not contact CDS load balancer rhui2-cds01.us-west-2.aws.ce.redhat.com
I was using the following CloudFormation template to configure our cluster: http://docs.aws.amazon.com/quickstart/latest/cloudera/step2b.html
It provisions a NAT instance through which traffic from the private subnet is proxied; the private subnet is where the Cloudera instances are deployed. It's still not clear why the original NAT instance wasn't providing outbound access to the internet. We removed the NAT instance and replaced it with a NAT gateway, a newer AWS product: https://aws.amazon.com/blogs/aws/new-managed-nat-network-address-translation-gateway-for-aws/
Replacing the original NAT instance entries in the private subnet's routing table with the identifier of the NAT gateway resolved the issue.
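A quick way to confirm whether an instance in the private subnet has outbound access is to attempt a TCP connection from it. This is a minimal sketch using only the standard library; the host name comes from the error message above, while port 443 is an assumption on my part.

```python
import socket


def can_reach(host, port, timeout_s=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts alike.
        return False


# Example, to be run on a node in the private subnet (port is an assumption):
# can_reach("rhui2-cds01.us-west-2.aws.ce.redhat.com", 443)
```

If this returns False on the private nodes but True on an instance in the public subnet, the routing table or NAT configuration is the likely place to look.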
01-29-2016
11:50 AM
1 Kudo
Our NAT instance was configured incorrectly. Switching to AWS' new NAT Gateway within the VPC wizard resolved this issue.
01-28-2016
02:08 PM
Attempting to bootstrap Cloudera Manager and a cluster; all nodes are failing when the bootstrap attempts to install the 'screen' package. Have attempted with the following two AWS AMIs:
ami-414b7271 (RHEL 6.6, default option for the c34 template)
ami-11125e21 (RHEL 6.5)
[2016-01-28 16:13:56] ERROR [pipeline-thread-1] - c.c.l.p.DatabasePipelineRunner: Pipeline 33280be1-0db3-403e-b15f-e59c9b10a1ca suspended due to failure
com.cloudera.launchpad.common.ssh.SshException: Script execution failed with code 1. Script: sudo yum -C list installed 'screen' 2>&1 > /dev/null && echo "Package screen is already installed and upgrades are not forced. Skipping." || sudo yum install -d 1 --assumeyes 'screen'
at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging.run(SshJobFailFastWithOutputLogging.java:45) ~[launchpad-pipeline-common-2.0.0.jar!/:2.0.0]
at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging.run(SshJobFailFastWithOutputLogging.java:27) ~[launchpad-pipeline-common-2.0.0.jar!/:2.0.0]
at com.cloudera.launchpad.pipeline.job.Job3.runUnchecked(Job3.java:32) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
at com.cloudera.launchpad.pipeline.job.Job3$$FastClassBySpringCGLIB$$54178503.invoke(<generated>) ~[spring-core-4.1.6.RELEASE.jar!/:2.0.0]
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) ~[spring-core-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:717) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:97) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at com.cloudera.launchpad.pipeline.PipelineJobProfiler$1.call(PipelineJobProfiler.java:67) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
at com.codahale.metrics.Timer.time(Timer.java:101) ~[metrics-core-3.1.0.jar!/:3.1.0]
at com.cloudera.launchpad.pipeline.PipelineJobProfiler.profileJobRun(PipelineJobProfiler.java:63) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_65]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_65]
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:68) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:653) ~[spring-aop-4.1.6.RELEASE.jar!/:4.1.6.RELEASE]
at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging$$EnhancerBySpringCGLIB$$6f647027.runUnchecked(<generated>) ~[spring-core-4.1.6.RELEASE.jar!/:2.0.0]
at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:159) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
at com.cloudera.launchpad.pipeline.util.PipelineRunner$JobCallable.call(PipelineRunner.java:130) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78) ~[guava-retrying-1.0.6.jar!/:na]
at com.github.rholder.retry.Retryer.call(Retryer.java:110) ~[guava-retrying-1.0.6.jar!/:na]
at com.cloudera.launchpad.pipeline.util.PipelineRunner.attemptMultipleJobExecutionsWithRetries(PipelineRunner.java:99) ~[launchpad-pipeline-2.0.0.jar!/:2.0.0]
at com.cloudera.launchpad.pipeline.DatabasePipelineRunner.run(DatabasePipelineRunner.java:125) ~[launchpad-pipeline-database-2.0.0.jar!/:2.0.0]
at com.cloudera.launchpad.ExceptionHandlingRunnable.run(ExceptionHandlingRunnable.java:57) [launchpad-common-2.0.0.jar!/:2.0.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_65]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_65]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_65]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
[2016-01-28 16:13:56] ERROR [pipeline-thread-1] - c.c.l.p.DatabasePipelineRunner: Pipeline '33280be1-0db3-403e-b15f-e59c9b10a1ca' failed
at com.cloudera.launchpad.pipeline.ssh.SshJobFailFastWithOutputLogging$$EnhancerBySpringCGLIB$$6f647027
at com.cloudera.launchpad.bootstrap.InstallPackages.InstallOrUpgradePackage:1
[2016-01-28 16:13:56] INFO [pipeline-thread-1] - c.c.l.p.s.PipelineRepositoryService: Pipeline '33280be1-0db3-403e-b15f-e59c9b10a1ca': RUNNING -> SUSPENDED
[2016-01-28 16:13:56] INFO [pipeline-thread-1] - c.c.l.d.DeploymentRepositoryService: Deployment 'manager': BOOTSTRAPPING -> BOOTSTRAP_FAILED
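For readers trying to follow what the failing script does, its `A && echo ... || B` chain is a check-then-install: query the local yum cache for the package, and run the install only if that check fails. Below is a rough Python rendering of the same logic. The function name `install_if_missing` and the injectable `run` parameter are my own additions for illustration; the yum arguments are copied from the script in the log, and unlike the original this sketch does not redirect the check's output.

```python
import subprocess


def install_if_missing(package, run=subprocess.call):
    """Mirror the check-then-install logic of the bootstrap script."""
    # `yum -C` works from the local cache only; exit code 0 means installed.
    check = ["sudo", "yum", "-C", "list", "installed", package]
    if run(check) == 0:
        return "skipped"
    # The check failed, so fall through to the real install, which needs
    # working access to the yum repositories.
    install = ["sudo", "yum", "install", "-d", "1", "--assumeyes", package]
    code = run(install)
    if code != 0:
        raise RuntimeError(f"yum install {package} failed with code {code}")
    return "installed"
```

Seen this way, the "failed with code 1" in the log is the exit status of the `yum install` branch, so running that command by hand on a node should show the underlying yum error.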
Labels: Cloudera Manager