ERROR: ClusterDock installation

Re: ERROR: ClusterDock installation

New Contributor

Hey Dima,

I simply overlooked the hostname:port at first, probably because of the error messages I saw.

The startup issues are reproducible and (so far) always affect Impala, plus other services at random. To be more precise, they are health issues (and some configuration issues, mainly recommending more nodes and higher memory settings). Sometimes restarting a service fixes its health issues, but in the case of Impala (CM complains about the Query Monitoring Status) a restart does not help. That's okay for me at the moment, because I can still play around with Cloudera Manager.

The problem might be connected with ZooKeeper (its network config?), as the respective error logs mention ZooKeeper timeouts in the error/fatal lines.

Concerning the Hue port redirection, I would highly appreciate it if you could include it or give me a hint about where I can add it myself (in the Dockerfile?).

Regards

Re: ERROR: ClusterDock installation

Rising Star
The Impala issue (and the config errors you're seeing) sounds like you're starved for RAM. Even if all 16 GB are available to Docker, memory that isn't actually free doesn't help, so your cluster just won't have enough available. The ZooKeeper errors, if I had to guess, are just a red herring.
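To make the "available versus actually free" distinction concrete, here is a minimal sketch that reads the kernel's own estimate from `/proc/meminfo` on a Linux Docker host. The function names are made up for illustration and are not part of clusterdock; `MemAvailable` is only reported by newer kernels, so the sketch falls back to `MemFree`.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to their numeric values.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_gib(meminfo):
    """Memory the kernel estimates is usable by new workloads, in GiB."""
    # Prefer MemAvailable; fall back to MemFree when the kernel does not report it.
    kb = meminfo.get("MemAvailable", meminfo.get("MemFree", 0))
    return kb / (1024.0 * 1024.0)


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Running `available_gib(read_meminfo())` on the host just before starting the cluster gives a rough idea of how much memory the containers can really use, which may be far less than the installed RAM.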

The Hue port change is defined in the CDH cluster topology. It involves a few Python code changes so I'll take that on.

Re: ERROR: ClusterDock installation

New Contributor

I ran into a similar issue when I tried to spin up a CDH5 cluster (OS: Mac). Below is the trace:

➜  ~ clusterdock_run ./bin/start_cluster cdh
INFO:clusterdock.cluster:Successfully started node-2.cluster (IP address: 192.168.123.3).
INFO:clusterdock.cluster:Successfully started node-1.cluster (IP address: 192.168.123.2).
INFO:clusterdock.cluster:Started cluster in 8.90 seconds.
INFO:clusterdock.topologies.cdh.actions:Changing server_host to node-1.cluster in /etc/cloudera-scm-agent/config.ini...
INFO:clusterdock.topologies.cdh.actions:Restarting CM agents...
cloudera-scm-agent is already stopped
Starting cloudera-scm-agent: [  OK  ]
Stopping cloudera-scm-agent: [  OK  ]
Starting cloudera-scm-agent: [  OK  ]
INFO:clusterdock.topologies.cdh.actions:Waiting for Cloudera Manager server to come online...
INFO:clusterdock.topologies.cdh.actions:Detected Cloudera Manager server after 57.11 seconds.
INFO:clusterdock.topologies.cdh.actions:CM server is now accessible at http://moby:32771
INFO:clusterdock.topologies.cdh.cm:Detected CM API v13.
INFO:clusterdock.topologies.cdh.cm_utils:Updating database configurations...
INFO:clusterdock.topologies.cdh.cm:Updating NameNode references in Hive metastore...
INFO:clusterdock.topologies.cdh.actions:Once its service starts, Hue server will be accessible at http://moby:32770
INFO:clusterdock.topologies.cdh.actions:Deploying client configuration...
INFO:clusterdock.topologies.cdh.actions:Starting cluster...
Traceback (most recent call last):
  File "./bin/start_cluster", line 70, in <module>
    main()
  File "./bin/start_cluster", line 63, in main
    actions.start(args)
  File "/root/clusterdock/clusterdock/topologies/cdh/actions.py", line 154, in start
    raise Exception('Failed to start cluster.')
Exception: Failed to start cluster.

To spin up the cluster, I downloaded the latest Docker DMG from here and followed the clusterdock instructions mentioned here.

Re: ERROR: ClusterDock installation

Rising Star
How much memory do you have allocated to your Docker for Mac instance, Praveen?

Re: ERROR: ClusterDock installation

New Contributor

I haven't provided any memory configuration, so I assume it's the default configuration, which is unlimited (as mentioned in the post here)?

Edit: However, the physical memory on my Mac is 8 GB.

Re: ERROR: ClusterDock installation

Rising Star
By default, Docker for Mac only allocates 1 or 2 GB of system memory to the Linux virtual machine that actually runs Docker containers. Even if you gave it all 8 GB, this wouldn't be enough memory to start up CM/CDH successfully. Your best bet would be to add --dont-start-cluster to the end of your clusterdock_run command and then try starting up one service at a time in Cloudera Manager (though, again, you have too little memory on your machine to have a good experience doing this).
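One way to see how much memory the Docker virtual machine really has is to ask the daemon itself. The sketch below assumes a Docker release whose `docker info` command accepts `--format` with the `MemTotal` field. The 16 GiB threshold is purely an assumption for illustration, loosely based on the numbers mentioned in this thread, and not a documented CM/CDH requirement.

```python
import subprocess

# Assumption for illustration only: not an official minimum for CM/CDH.
ASSUMED_MIN_GIB = 16.0


def docker_mem_total_bytes():
    """Ask the Docker daemon how much memory it can see, in bytes."""
    out = subprocess.check_output(
        ["docker", "info", "--format", "{{.MemTotal}}"])
    return int(out.decode().strip())


def memory_verdict(total_bytes, min_gib=ASSUMED_MIN_GIB):
    """Describe whether the memory visible to Docker looks sufficient."""
    gib = total_bytes / float(1024 ** 3)
    if gib < min_gib:
        return ("%.1f GiB visible to Docker; likely too little, "
                "consider --dont-start-cluster" % gib)
    return "%.1f GiB visible to Docker" % gib
```

Calling `memory_verdict(docker_mem_total_bytes())` before `clusterdock_run` would show, for example, that a default Docker for Mac install exposes only a small fraction of the machine's physical memory.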

Re: ERROR: ClusterDock installation

New Contributor

Thank you for the quick reply @dspivak. Will try suggestions you posted and comment here shortly.

Re: ERROR: ClusterDock installation

New Contributor

I had a similar problem before with 16 GB of RAM on CentOS 7. I got the same error:

Traceback (most recent call last):
  File "./bin/start_cluster", line 70, in <module>
    main()
  File "./bin/start_cluster", line 63, in main
    actions.start(args)
  File "/root/clusterdock/clusterdock/topologies/cdh/actions.py", line 154, in start
    raise Exception('Failed to start cluster.')
Exception: Failed to start cluster  

Later on I bumped the RAM up to 32 GB. Now I am running into a different error related to host connections. I tried cleaning up /etc/hosts, but it did not help. What could be the cause?

 

[root@hostname docker.service.d]# clusterdock_run ./bin/start_cluster cdh
!!! Parallel execution exception under host u'192.168.123.2':
Process 192.168.123.2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 242, in inner
    submit(task.run(*args, **kwargs))
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/decorators.py", line 181, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 171, in __call__
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/root/clusterdock/clusterdock/ssh.py", line 38, in _quiet_task
    return run(command)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 677, in host_prompting_wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/operations.py", line 1088, in run
    shell_escape=shell_escape, capture_buffer_size=capture_buffer_size,
  File "/usr/local/lib/python2.7/dist-packages/fabric/operations.py", line 928, in _run_command
    channel=default_channel(), command=wrapped_command, pty=pty,
  File "/usr/local/lib/python2.7/dist-packages/fabric/state.py", line 418, in default_channel
    chan = _open_session()
  File "/usr/local/lib/python2.7/dist-packages/fabric/state.py", line 410, in _open_session
    return connections[env.host_string].get_transport().open_session()
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 159, in __getitem__
    self.connect(key)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 151, in connect
    user, host, port, cache=self, seek_gateway=seek_gateway)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 603, in connect
    raise NetworkError(msg, e)
NetworkError: Timed out trying to connect to 192.168.123.2 (tried 60 times)

Fatal error: One or more hosts failed while executing task '_quiet_task'

Underlying exception:
    Timed out trying to connect to 192.168.123.2 (tried 60 times)

Aborting.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/root/clusterdock/clusterdock/cluster.py", line 265, in start
    raise Exception("Timed out waiting for {0} to become reachable.".format(self.hostname))
Exception: Timed out waiting for node-1 to become reachable.

 

!!! Parallel execution exception under host u'192.168.123.3':
Process 192.168.123.3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 242, in inner
    submit(task.run(*args, **kwargs))
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/decorators.py", line 181, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 171, in __call__
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/root/clusterdock/clusterdock/ssh.py", line 38, in _quiet_task
    return run(command)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 677, in host_prompting_wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/operations.py", line 1088, in run
    shell_escape=shell_escape, capture_buffer_size=capture_buffer_size,
  File "/usr/local/lib/python2.7/dist-packages/fabric/operations.py", line 928, in _run_command
    channel=default_channel(), command=wrapped_command, pty=pty,
  File "/usr/local/lib/python2.7/dist-packages/fabric/state.py", line 418, in default_channel
    chan = _open_session()
  File "/usr/local/lib/python2.7/dist-packages/fabric/state.py", line 410, in _open_session
    return connections[env.host_string].get_transport().open_session()
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 159, in __getitem__
    self.connect(key)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 151, in connect
    user, host, port, cache=self, seek_gateway=seek_gateway)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 603, in connect
    raise NetworkError(msg, e)
NetworkError: Timed out trying to connect to 192.168.123.3 (tried 60 times)

Fatal error: One or more hosts failed while executing task '_quiet_task'

Underlying exception:
    Timed out trying to connect to 192.168.123.3 (tried 60 times)

Aborting.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/root/clusterdock/clusterdock/cluster.py", line 265, in start
    raise Exception("Timed out waiting for {0} to become reachable.".format(self.hostname))
Exception: Timed out waiting for node-2 to become reachable.

 

INFO:clusterdock.cluster:Started cluster in 61.44 seconds.
!!! Parallel execution exception under host u'192.168.123.2':
!!! Parallel execution exception under host u'192.168.123.3':
Process 192.168.123.2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 242, in inner
Process 192.168.123.3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    submit(task.run(*args, **kwargs))
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/decorators.py", line 181, in inner
    return func(*args, **kwargs)
    self.run()
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 171, in __call__
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/root/clusterdock/clusterdock/ssh.py", line 45, in _task
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 242, in inner
    return run(command)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 677, in host_prompting_wrapper
    submit(task.run(*args, **kwargs))
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/decorators.py", line 181, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/operations.py", line 1088, in run
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 171, in __call__
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/root/clusterdock/clusterdock/ssh.py", line 45, in _task
    return run(command)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 677, in host_prompting_wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fabric/operations.py", line 1088, in run
    shell_escape=shell_escape, capture_buffer_size=capture_buffer_size,
  File "/usr/local/lib/python2.7/dist-packages/fabric/operations.py", line 928, in _run_command
    channel=default_channel(), command=wrapped_command, pty=pty,
  File "/usr/local/lib/python2.7/dist-packages/fabric/state.py", line 418, in default_channel
    chan = _open_session()
  File "/usr/local/lib/python2.7/dist-packages/fabric/state.py", line 410, in _open_session
    return connections[env.host_string].get_transport().open_session()
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 159, in __getitem__
    self.connect(key)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 151, in connect
    user, host, port, cache=self, seek_gateway=seek_gateway)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 603, in connect
    raise NetworkError(msg, e)
NetworkError: Timed out trying to connect to 192.168.123.2 (tried 60 times)
    shell_escape=shell_escape, capture_buffer_size=capture_buffer_size,
  File "/usr/local/lib/python2.7/dist-packages/fabric/operations.py", line 928, in _run_command
    channel=default_channel(), command=wrapped_command, pty=pty,
  File "/usr/local/lib/python2.7/dist-packages/fabric/state.py", line 418, in default_channel
    chan = _open_session()
  File "/usr/local/lib/python2.7/dist-packages/fabric/state.py", line 410, in _open_session
    return connections[env.host_string].get_transport().open_session()
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 159, in __getitem__
    self.connect(key)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 151, in connect
    user, host, port, cache=self, seek_gateway=seek_gateway)
  File "/usr/local/lib/python2.7/dist-packages/fabric/network.py", line 603, in connect
    raise NetworkError(msg, e)
NetworkError: Timed out trying to connect to 192.168.123.3 (tried 60 times)

Fatal error: One or more hosts failed while executing task '_task'

Underlying exception:
    Timed out trying to connect to 192.168.123.2 (tried 60 times)

Re: ERROR: ClusterDock installation

Rising Star
Hi Robert,

What version of Docker are you running? Hard to debug with the information given. Obviously, clusterdock can't SSH into the container nodes, but it's hard to guess why without more information.
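As a first diagnostic, it can help to check from the Docker host whether the nodes accept TCP connections on the SSH port at all. The sketch below is a generic probe, not part of clusterdock; the port (22) is an assumption, and the IP addresses in the usage note are the ones from the trace above.

```python
import socket


def ssh_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, or timed out: treat all as "not reachable".
        return False
    conn.close()
    return True


def probe(hosts, port=22):
    """Map each host to whether its SSH port (assumed to be 22) is reachable."""
    return {host: ssh_port_open(host, port) for host in hosts}
```

For example, `probe(["192.168.123.2", "192.168.123.3"])` returning `False` for both nodes would point at the container network (bridge, firewall, or routing) rather than at SSH credentials.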

Re: ERROR: ClusterDock installation

New Contributor

I am using 1.12.0.

Containers: 10
Running: 2
Paused: 0
Stopped: 8
Images: 4
Server Version: 1.12.0
Storage Driver: devicemapper
Pool Name: docker-253:0-2684354944-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 10.2 GB
Data Space Total: 107.4 GB
Data Space Available: 97.17 GB
Metadata Space Used: 9.74 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.138 GB
Thin Pool Minimum Free Space: 10.74 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata