New Contributor
Posts: 6
Registered: ‎09-23-2016

clusterdock and AWS

Hi,

I'm trying to run a Hadoop cluster on an m4.xlarge (4 vCPU, 16 GiB) running Ubuntu 14.04.

I'm using CDH 5.8 for Docker.

When I run the command mentioned in the documentation:

clusterdock_run ./bin/start_cluster -n testing cdh --primary-node=node-1 --secondary-nodes='node-{2..4}'

I get:

INFO:clusterdock.topologies.cdh.actions:Waiting for Cloudera Manager server to come online...
Traceback (most recent call last):
  File "./bin/start_cluster", line 70, in <module>
    main()
  File "./bin/start_cluster", line 63, in main
    actions.start(args)
  File "/root/clusterdock/clusterdock/topologies/cdh/actions.py", line 108, in start
    CM_SERVER_PORT, timeout_sec=180)
  File "/root/clusterdock/clusterdock/utils.py", line 52, in wait_for_port_open
    timeout_sec, address, port
Exception: Timed out after 180 seconds waiting for 192.168.124.2:7180 to be open.

Any ideas?

Thank you!
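
For context, the exception in the traceback is raised by a helper that waits for a TCP port to accept connections. Below is a minimal Python sketch of that kind of check. It is not clusterdock's actual wait_for_port_open; the function name wait_for_port and its parameters are hypothetical and chosen only for illustration.

```python
import socket
import time


def wait_for_port(address, port, timeout_sec=180, interval_sec=1.0):
    """Return True once a TCP connection succeeds, False if the timeout expires."""
    deadline = time.monotonic() + timeout_sec
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((address, port), timeout=interval_sec):
                return True
        except OSError:
            pass  # refused or unreachable: retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_sec)
```

Calling wait_for_port("192.168.124.2", 7180) from the Docker host would repeat the check that timed out above. A False result only says that nothing answered on that port in time; it does not say why.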

Cloudera Employee
Posts: 54
Registered: ‎07-19-2016

Re: clusterdock and AWS

What version of Docker are you running? And can you run free -g to see how much RAM is available before you try starting the cluster?
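
If it helps to script the memory check, here is a rough Python sketch that reads the same kind of information free -g reports, taken from /proc/meminfo on Linux. The function name available_gib is hypothetical, and the fallback to MemFree is an assumption for older kernels that do not report MemAvailable.

```python
def available_gib(meminfo_text):
    """Return MemAvailable (or MemFree as a fallback) in GiB from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # memory values are reported in kB
    kib = fields.get("MemAvailable", fields.get("MemFree", 0))
    return kib / (1024 * 1024)
```

On the host, pass it the contents of /proc/meminfo, for example available_gib(open("/proc/meminfo").read()), before starting the cluster.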
New Contributor
Posts: 6
Registered: ‎09-23-2016

Re: clusterdock and AWS

Right, the problem was a lack of memory.

I tried a 32 GiB EC2 instance, and it works.

Is there a way to expose other ports on the master?

Since clusterdock starts the primary Docker image itself, it's not possible to pass the -P option to the 'docker run' command that it executes.

Cloudera Employee
Posts: 54
Registered: ‎07-19-2016

Re: clusterdock and AWS

Which other ports are you looking to expose to the host?
New Contributor
Posts: 6
Registered: ‎09-23-2016

Re: clusterdock and AWS

8020 (HDFS NameNode)

Cloudera Employee
Posts: 54
Registered: ‎07-19-2016

Re: clusterdock and AWS

Ah, no, exposing individual ports like that isn't possible.
New Contributor
Posts: 6
Registered: ‎09-23-2016

Re: clusterdock and AWS

OK. Is there a way, using Cloudera Manager, to see on which DataNodes the HDFS blocks of a given file are created, and on which nodes those blocks are replicated?

Such a thing is possible by connecting to WebHDFS (port 50070), and it can be very useful sometimes.

Cloudera Employee
Posts: 54
Registered: ‎07-19-2016

Re: clusterdock and AWS

You can SSH into any node of the cluster on the machine with the clusterdock_ssh command described in the docs. Once you're there, any of the normal hdfs utilities that give you that kind of information can be run from the command line.
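
One such utility is hdfs fsck, whose -files, -blocks and -locations options print each block of a file together with the DataNodes holding its replicas. The Python sketch below wraps that command. It assumes it runs on a cluster node where the hdfs CLI is on the PATH; the function names and any path passed in are hypothetical.

```python
import subprocess


def fsck_command(hdfs_path):
    """Build the hdfs fsck invocation that lists blocks and their DataNode locations."""
    return ["hdfs", "fsck", hdfs_path, "-files", "-blocks", "-locations"]


def block_report(hdfs_path):
    """Run the command and return its text output (requires the hdfs CLI)."""
    return subprocess.check_output(fsck_command(hdfs_path), universal_newlines=True)
```

Running the equivalent command directly in the shell after clusterdock_ssh gives the same output; the wrapper is only useful if you want to post-process the report.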

New Contributor
Posts: 6
Registered: ‎09-23-2016

Re: clusterdock and AWS

I managed to connect to the primary node once, but now I get the following message:

clusterdock_ssh f864b8b5fb1f
Error response from daemon: Container f53fe5e1635d0b4370f17e914e22234b2b15dc0a468b8a4e74d3e92aa562662a is not running

And of course, the primary container is up:

docker ps
CONTAINER ID   IMAGE                                                        COMMAND        CREATED             STATUS             PORTS                                              NAMES
8f6be65326cc   docker.io/cloudera/clusterdock:cdh580_cm581_secondary-node   "/sbin/init"   About an hour ago   Up About an hour                                                      awesome_banach
30d5d9d844a8   docker.io/cloudera/clusterdock:cdh580_cm581_secondary-node   "/sbin/init"   About an hour ago   Up About an hour                                                      tender_murdock
5503b34c9b99   docker.io/cloudera/clusterdock:cdh580_cm581_secondary-node   "/sbin/init"   About an hour ago   Up About an hour                                                      jolly_heisenberg
f864b8b5fb1f   docker.io/cloudera/clusterdock:cdh580_cm581_primary-node     "/sbin/init"   About an hour ago   Up About an hour   0.0.0.0:32779->7180/tcp, 0.0.0.0:32778->8888/tcp   big_lichterman

Cloudera Employee
Posts: 54
Registered: ‎07-19-2016

Re: clusterdock and AWS

Use the fully qualified domain name of the container, which is visible during startup (e.g. node-1.cluster), as the sole argument to clusterdock_ssh.
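
If the startup output is no longer on screen, the hostname Docker recorded for a container can be read from docker inspect. The Python sketch below extracts it from that command's JSON output. The function name container_fqdn is hypothetical, and it is an assumption that the value it returns matches the name clusterdock_ssh expects.

```python
import json


def container_fqdn(inspect_json):
    """Return the hostname recorded in `docker inspect` output for one container."""
    config = json.loads(inspect_json)[0]["Config"]
    hostname = config.get("Hostname", "")
    domain = config.get("Domainname", "")
    # Some Docker versions split the name into Hostname and Domainname.
    return hostname + "." + domain if domain else hostname
```

Feed it the output of docker inspect f864b8b5fb1f (the primary container in the listing above) and compare the result with the names printed at startup.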