
Question on create_image.sh "A Multi-node Docker Cluster platform to quickly spin up HDP"

New Contributor

I followed your instructions and am stuck. The images.sh script hangs at line 65 (shown below) because it prompts for a password. Did I miss a step? How do you suggest I proceed from here?

ssh -o ConnectTimeout=4 -o CheckHostIP=no -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -q root@$DH mkdir -p /opt/docker_cluster/ambari-server-$AMBARI_VERSION

Regards @rmaruthiyodan

Joginder
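
The ssh call above stops at a password prompt when key-based login is not configured for root on the Docker hosts. A minimal sketch of one way to set that up, assuming OpenSSH's ssh-keygen and ssh-copy-id are installed; the `run` wrapper, the `DRY_RUN` switch, and the `setup_passwordless_ssh` helper are illustrative names and not part of the docker-hdp-lab scripts:

```shell
#!/usr/bin/env bash
# Sketch: distribute root's public key to each Docker host so that
# non-interactive ssh calls stop prompting for a password.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

setup_passwordless_ssh() {
  local key="$HOME/.ssh/id_rsa" host
  # Create a key pair only if none exists yet (empty passphrase).
  if [ ! -f "$key" ]; then
    run ssh-keygen -t rsa -b 4096 -N "" -f "$key"
  fi
  for host in "$@"; do
    # ssh-copy-id asks for the remote password once per host.
    run ssh-copy-id -i "$key.pub" "root@$host"
  done
}

# Example (host names taken from this thread):
# DRY_RUN=0 setup_passwordless_ssh js5mgr.js.local js5repo.js.local
```

Running it first with the default dry-run setting shows exactly which commands would be executed before any key is copied.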


Re: Question on create_image.sh "A Multi-node Docker Cluster platform to quickly spin up HDP"

New Contributor

I bypassed the password error by setting up passwordless ssh. I can now create the images but have trouble creating the cluster. The attached output contains the command and the errors; it was generated by uncommenting set -x in your create_cluster.sh. Please advise on how to fix the two errors. Regards @rmaruthiyodan

Cannot connect to the Docker daemon. Is the docker daemon running on this host?
and
 An instance already exists in this cluster, with the name: ' js5wrk1.js.local ' Please use unique hostnames...
[root@js5mgr ~]# /home/js/docker-hdp-lab/create_cluster.sh cluster.props
+ '[' 1 -ne 1 ']'
+ '[' '!' -f cluster.props ']'
+ CLUSTER_PROPERTIES=cluster.props
+ source cluster.props
+ '[' '!' root ']'
+ '[' '!' js5 ']'
+ '[' '!' 2.4.1.0 ']'
+ '[' '!' 2.4.1.0 ']'
+ '[' '!' 4 ']'
+ '[' '!' js.local ']'
++ grep 'HOST[0-9]*=' cluster.props
++ wc -l
+ '[' 4 -ne 4 ']'
++ wc -l
++ grep 'HOST[0-9]*_SERVICE' cluster.props
+ '[' 4 -ne 4 ']'
+ source /etc/docker-hdp-lab.conf
++ SWARM_MANAGER=js5mgr.js.local
++ DEFAULT_DOMAIN_NAME=js.local
++ LOCAL_REPO_NODE=js5repo.js.local
++ OVERLAY_NETWORK=10.0.5.0/24
+++ hostname -i
+++ awk '{print $1}'
++ LOCAL_IP=192.168.11.161
++ NUM_OF_DOCKER_HOSTS=5
++ DOCKER_HOST1=js5mgr.js.local
++ DOCKER_HOST2=js5repo.js.local
++ DOCKER_HOST3=js5wrk1.js.local
++ DOCKER_HOST3=js5wrk2.js.local
++ DOCKER_HOST3=js5wrk3.js.local
++ CLEAN_UP_EXCEPTION_FILE=/opt/maggie/daily_exception_list_for_stop
+ '[' root '!=' root ']'
+ export ssh_cmd=/bin/ssh
+ ssh_cmd=/bin/ssh
+ export tee_cmd=tee
+ tee_cmd=tee
+ __resource_check
+ (( i=1 ))
+ (( i<=5 ))
+ eval 'dh=${DOCKER_HOST1}'
++ dh=js5mgr.js.local
+ ssh_options='-o CheckHostIP=no -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o ConnectTimeout=5'
+ read free cache
++ /bin/ssh js5mgr.js.local 'cat /proc/meminfo'
++ egrep 'MemFree|Cached'
++ head -n2
++ awk '{print $2}'
+ '[' 0 -ne 0 ']'
+ free_memory=4
+ '[' 4 -lt 4 ']'
+ (( i++  ))
+ (( i<=5 ))
+ eval 'dh=${DOCKER_HOST2}'
++ dh=js5repo.js.local
+ ssh_options='-o CheckHostIP=no -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o ConnectTimeout=5'
+ read free cache
++ /bin/ssh js5repo.js.local 'cat /proc/meminfo'
++ egrep 'MemFree|Cached'
++ head -n2
++ awk '{print $2}'
+ '[' 0 -ne 0 ']'
+ free_memory=4
+ '[' 4 -lt 4 ']'
+ (( i++  ))
+ (( i<=5 ))
+ eval 'dh=${DOCKER_HOST3}'
++ dh=js5wrk3.js.local
+ ssh_options='-o CheckHostIP=no -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o ConnectTimeout=5'
+ read free cache
++ /bin/ssh js5wrk3.js.local 'cat /proc/meminfo'
++ egrep 'MemFree|Cached'
++ head -n2
++ awk '{print $2}'
+ '[' 0 -ne 0 ']'
/home/js/docker-hdp-lab/create_cluster.sh: line 31: ( + )/1024/1024 : syntax error: operand expected (error token is ")/1024/1024 ")
+ __validate_hostnames
+ (( i=1 ))
+ (( i<=4 ))
+ eval 'NODENAME=${HOST1}'
++ NODENAME=js5wrk1
++ docker -H js5mgr.js.local:4000 ps -a
++ grep js5wrk1
++ awk -F / '{print $NF}'
Cannot connect to the Docker daemon. Is the docker daemon running on this host?
+ existing_node=
+ '[' '!' -z '' ']'
+ NODENAME=js5wrk1.js.local
++ getent hosts js5wrk1.js.local
+ IP='192.168.11.152  js5wrk1.js.local'
+ '[' 0 -eq 0 ']'
++ tput setaf 1
++ tput sgr 0
+ echo -e '\t  An instance already exists in this cluster, with the name: '\''' js5wrk1.js.local ''\'' Please use unique hostnames...'
          An instance already exists in this cluster, with the name: ' js5wrk1.js.local ' Please use unique hostnames...
+ exit
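
A side note on the "syntax error: operand expected" at line 31 in the trace above: bash arithmetic reports this when both operands are empty, which most likely means the ssh to js5wrk3.js.local returned no /proc/meminfo output. A hedged sketch of a more defensive calculation follows. `free_memory_gb` is a hypothetical helper, not the script's actual code, and it assumes the same MemFree and Cached fields that the trace greps for.

```shell
#!/usr/bin/env bash
# Sketch: compute free + cached memory in whole GB from /proc/meminfo
# text on stdin, failing cleanly when the fields are missing.
free_memory_gb() {
  local meminfo free cached v
  meminfo=$(cat)   # the complete meminfo text
  # Anchoring the field names avoids matching SwapCached.
  free=$(printf '%s\n' "$meminfo" | awk '/^MemFree:/ {print $2}')
  cached=$(printf '%s\n' "$meminfo" | awk '/^Cached:/ {print $2}')
  for v in "$free" "$cached"; do
    case $v in
      ''|*[!0-9]*)
        echo "could not read MemFree/Cached" >&2
        return 1
        ;;
    esac
  done
  # Values are in kB; two divisions by 1024 give GB.
  echo $(( (free + cached) / 1024 / 1024 ))
}

# Example:
# ssh "$dh" cat /proc/meminfo | free_memory_gb || echo "skipping $dh"
```

With a check like this, an unreachable host produces a clear message instead of an arithmetic error.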


Re: Question on create_image.sh "A Multi-node Docker Cluster platform to quickly spin up HDP"

Cloudera Employee

@Joginder Sethi thanks for attaching the output from the script.

For the error -

Cannot connect to the Docker daemon. Is the docker daemon running on this host?

- Check whether the Docker daemon is actually running on the RHEL/CentOS 7 server: systemctl status docker and systemctl status docker-hdp-lab

- Start both services if they are not already running.

And for -

An instance already exists in this cluster, with the name: ' js5wrk1.js.local ' Please use unique hostnames...

The output indicates that the hostname defined in cluster.props already resolves. You must use new, unique names for the nodes to be created.
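
The trace in the question shows the script calling getent hosts on the fully qualified node name, so a similar check can be run before create_cluster.sh. This is only a sketch: `find_resolvable_names` and the `RESOLVE_CMD` hook are illustrative names, and the candidate names must be passed in by hand.

```shell
#!/usr/bin/env bash
# Sketch: report which candidate node names already resolve.
# RESOLVE_CMD is an illustrative hook so the lookup can be swapped out.
RESOLVE_CMD=${RESOLVE_CMD:-getent hosts}

find_resolvable_names() {
  local name clash=0
  for name in "$@"; do
    # Word splitting of RESOLVE_CMD is intended ("getent" "hosts").
    if $RESOLVE_CMD "$name" >/dev/null 2>&1; then
      echo "already resolvable: $name"
      clash=1
    fi
  done
  # Non-zero exit status when at least one name is already taken.
  return "$clash"
}

# Example:
# find_resolvable_names js5wrk1.js.local js5hdpwrk1.js.local
```

Any name it prints would need to be changed in cluster.props before the script will accept it.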

Re: Question on create_image.sh "A Multi-node Docker Cluster platform to quickly spin up HDP"

New Contributor

@rmaruthiyodan

Thank you for the answer. I redid all of it again and that error went away.

As you can see, the overlay network js.local is inaccessible (please see the bottom of the attached output). My guess is that this is because the docker_gwbridge network has com.docker.network.bridge.enable_icc set to false.

Symptom: create_cluster.sh is stuck in the following loop (line 136, in function __populate_hosts_file()):

+ __populate_hosts_file
++ awk '{print $1}' root-js5-tmphostfile
+ for ip in '$(awk '\''{print $1}'\'' $USERNAME-$CLUSTERNAME-tmphostfile)'
+ /bin/ssh -o CheckHostIP=no -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@10.0.5.3 'cat >> /etc/hosts'
+ cat root-js5-tmphostfile
++ grep 10.0.5.3 root-js5-tmphostfile
++ awk '{print $2}'
+ echo 'Initialization of ' js5-ambari-server.js.local ' is taking some time to complete. Waiting for another 5s...'
Initialization of  js5-ambari-server.js.local  is taking some time to complete. Waiting for another 5s...
+ sleep 5
It never leaves the loop above. Further debugging showed that the overlay network is NOT reachable. Here is the output (with commands) of the debugging and verification steps:
[root@js5mgr ~]# ping 10.0.5.3
PING 10.0.5.3 (10.0.5.3) 56(84) bytes of data.
^C
--- 10.0.5.3 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4000ms


[root@js5mgr ~]#
[root@js5mgr ~]#  cat root-js5-tmphostfile
10.0.5.3 js5-ambari-server.js.local
10.0.5.4 js5hdpwrk1.js.local js5hdpwrk1
10.0.5.5 jshdp5wrk2.js.local jshdp5wrk2
10.0.5.6 js5hdpmgr.js.local js5hdpmgr
10.0.5.7 js5hdprepo.js.local js5hdprepo
[root@js5mgr ~]# ssh -v root@10.0.5.3
OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 60: Applying options for *
debug1: Executing proxy command: exec /usr/bin/sss_ssh_knownhostsproxy -p 22 10.0.5.3
debug1: permanently_set_uid: 0/0
debug1: identity file /root/.ssh/id_rsa type 1
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: identity file /root/.ssh/id_dsa type -1
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa type -1
debug1: identity file /root/.ssh/id_ecdsa-cert type -1
debug1: identity file /root/.ssh/id_ed25519 type -1
debug1: identity file /root/.ssh/id_ed25519-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.6.1
debug1: permanently_drop_suid: 0
========================================
[root@js5mgr ~]# docker inspect overlay-gatewaynode
[
    {
        "Id": "cdf29649e9d3890c6194c3d0bf1aa33ba3984388a411f515ba554fa6ce60ac66",
        "Created": "2017-01-24T03:53:38.690165477Z",
        "Path": "/start",
        "Args": [],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 3354,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2017-02-06T16:45:19.335210744Z",
            "FinishedAt": "2017-02-06T16:44:06.653213759Z"
        },
        "Image": "sha256:48699c480e7bae13c5dd53d904d4eeddc37f0593ad045e3b30d94bdfdf054ee2",
        "ResolvConfPath": "/var/lib/docker/containers/cdf29649e9d3890c6194c3d0bf1aa33ba3984388a411f515ba554fa6ce60ac66/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/cdf29649e9d3890c6194c3d0bf1aa33ba3984388a411f515ba554fa6ce60ac66/hostname",
        "HostsPath": "/var/lib/docker/containers/cdf29649e9d3890c6194c3d0bf1aa33ba3984388a411f515ba554fa6ce60ac66/hosts",
        "LogPath": "/var/lib/docker/containers/cdf29649e9d3890c6194c3d0bf1aa33ba3984388a411f515ba554fa6ce60ac66/cdf29649e9d3890c6194c3d0bf1aa33ba3984388a411f515ba554fa6ce60ac66-json.log",
        "Name": "/overlay-gatewaynode",
        "RestartCount": 0,
        "Driver": "devicemapper",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": null,
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "js.local",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": true,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": [
                "label=disable"
            ],
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": -1,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        },
        "GraphDriver": {
            "Name": "devicemapper",
            "Data": {
                "DeviceId": "129",
                "DeviceName": "docker-253:0-67194611-aba1f23cfbd4532ce140527da3f8c4d70c54289ff7cc22d1bd764bdcefddf4ff",
                "DeviceSize": "10737418240"
            }
        },
        "Mounts": [],
        "Config": {
            "Hostname": "overlay-gatewaynode",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "/start"
            ],
            "ArgsEscaped": true,
            "Image": "gatewaynode",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {
                "build-date": "20161102",
                "license": "GPLv2",
                "name": "CentOS Base Image",
                "vendor": "CentOS"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "ba2e68ed520e13be5b655c67b18dae22c629e305494857627bdbb9a23470290b",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/ba2e68ed520e",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "js.local": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": [
                        "overlay-gatewaynode",
                        "cdf29649e9d3"
                    ],
                    "NetworkID": "cd0190db98dac536de356a5327d00e8156d6abc05292c98985e480dff53543f9",
                    "EndpointID": "1f8b4b17281b16a819abbc158c9d0f64398f52e4be539298c7ce24d294edf01a",
                    "Gateway": "",
                    "IPAddress": "10.0.5.2",
                    "IPPrefixLen": 24,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:0a:00:05:02"
                }
            }
        }
    }
]
----
[root@js5mgr ~]# docker ps -a
CONTAINER ID        IMAGE                       COMMAND                  CREATED             STATUS              PORTS                                                                            NAMES
75b42231689d        hdp/ambari-agent-2.4.1.0    "/start"                 13 days ago         Up 13 minutes                                                                                        root-js5-js5hdprepo
d654d693ef54        hdp/ambari-agent-2.4.1.0    "/start"                 13 days ago         Up 12 minutes                                                                                        root-js5-js5hdpmgr
2ec49ae9dea1        hdp/ambari-agent-2.4.1.0    "/start"                 13 days ago         Up 12 minutes                                                                                        root-js5-jshdp5wrk2
40635efeeee2        hdp/ambari-agent-2.4.1.0    "/start"                 13 days ago         Up 13 minutes                                                                                        root-js5-js5hdpwrk1
5f570f789f24        hdp/ambari-server-2.4.1.0   "/start"                 13 days ago         Up 15 minutes                                                                                        root-js5-ambari-server
cdf29649e9d3        gatewaynode                 "/start"                 2 weeks ago         Up 12 hours                                                                                          overlay-gatewaynode
c2a8d44a46f0        swarm                       "/swarm join --adv..."   2 weeks ago         Up 12 hours         2375/tcp                                                                         swarm_join
3267974d1cfe        swarm                       "/swarm manage -H ..."   2 weeks ago         Up 12 hours         2375/tcp, 0.0.0.0:4000->4000/tcp                                                 swarm_manager
b9457eda9418        progrium/consul             "/bin/start -serve..."   2 weeks ago         Up 12 hours         53/tcp, 53/udp, 8300-8302/tcp, 8400/tcp, 8301-8302/udp, 0.0.0.0:8500->8500/tcp   consul
===================
[root@js5mgr ~]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
4c99debeaa23        bridge              bridge              local
5353566f4ca4        docker_gwbridge     bridge              local
9ec27b3a4eb4        host                host                local
cd0190db98da        js.local            overlay             global
bde7c3ff5154        none                null                local


[root@js5mgr ~]# docker inspect js.local
[
    {
        "Name": "js.local",
        "Id": "cd0190db98dac536de356a5327d00e8156d6abc05292c98985e480dff53543f9",
        "Created": "2017-01-23T22:53:33.512955644-05:00",
        "Scope": "global",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "10.0.5.0/24"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "2ec49ae9dea1746ac538426723f6ce146a1122a6b9f76722e4726f9927835d3e": {
                "Name": "root-js5-jshdp5wrk2",
                "EndpointID": "3b4aa685b976b43854bb141c45b2440d6a4e4d194e1bf2e8583f567284141931",
                "MacAddress": "02:42:0a:00:05:05",
                "IPv4Address": "10.0.5.5/24",
                "IPv6Address": ""
            },
            "40635efeeee2a42d97bf6a838520ed7951142ca097313b7d4ccba3611d4f0e27": {
                "Name": "root-js5-js5hdpwrk1",
                "EndpointID": "d7c5120db08f7baeeeb68c89c189123779846fbd3aa02ada350e257fff4003bb",
                "MacAddress": "02:42:0a:00:05:04",
                "IPv4Address": "10.0.5.4/24",
                "IPv6Address": ""
            },
            "5f570f789f249f51b58470d94ccf2368da13ca686f13d37fec9c77594a1e1943": {
                "Name": "root-js5-ambari-server",
                "EndpointID": "2389a438be6d4b5dd354ae604be6fabfa5da03675a47508e6fbd715015122a73",
                "MacAddress": "02:42:0a:00:05:03",
                "IPv4Address": "10.0.5.3/24",
                "IPv6Address": ""
            },
            "75b42231689d665994bb13682450d972649adade31c20109c7782f963732741f": {
                "Name": "root-js5-js5hdprepo",
                "EndpointID": "591415548f8d7254270e3646e54b62d83775fe6268f2c40a010a4aab1db0d62e",
                "MacAddress": "02:42:0a:00:05:07",
                "IPv4Address": "10.0.5.7/24",
                "IPv6Address": ""
            },
            "cdf29649e9d3890c6194c3d0bf1aa33ba3984388a411f515ba554fa6ce60ac66": {
                "Name": "overlay-gatewaynode",
                "EndpointID": "1f8b4b17281b16a819abbc158c9d0f64398f52e4be539298c7ce24d294edf01a",
                "MacAddress": "02:42:0a:00:05:02",
                "IPv4Address": "10.0.5.2/24",
                "IPv6Address": ""
            },
            "d654d693ef54251e37d33282c6b2eb11872225008f813c50d5b75978b25a0810": {
                "Name": "root-js5-js5hdpmgr",
                "EndpointID": "c436d1a34cb24d2dd7db0058d7d0b6569eda7701c2cb0ffffdc6cd6f1bb93c51",
                "MacAddress": "02:42:0a:00:05:06",
                "IPv4Address": "10.0.5.6/24",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]
[root@js5mgr ~]# docker inspect  docker_gwbridge
[
    {
        "Name": "docker_gwbridge",
        "Id": "5353566f4ca492a3f8070e53b245c24c9997c8645c7e1b446186c7ec96255f3b",
        "Created": "2017-01-20T23:17:21.613547615-05:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "2ec49ae9dea1746ac538426723f6ce146a1122a6b9f76722e4726f9927835d3e": {
                "Name": "gateway_2ec49ae9dea1",
                "EndpointID": "e8533ec76021456081b9941cc1374e721900a3c436d71e6ee85020e3cba0d8ef",
                "MacAddress": "02:42:ac:12:00:07",
                "IPv4Address": "172.18.0.7/16",
                "IPv6Address": ""
            },
            "40635efeeee2a42d97bf6a838520ed7951142ca097313b7d4ccba3611d4f0e27": {
                "Name": "gateway_40635efeeee2",
                "EndpointID": "e0dd7edc2fdca749166b014e8228f70c54cf9f8e8ea97df7898c6195f7099e99",
                "MacAddress": "02:42:ac:12:00:04",
                "IPv4Address": "172.18.0.4/16",
                "IPv6Address": ""
            },
            "5f570f789f249f51b58470d94ccf2368da13ca686f13d37fec9c77594a1e1943": {
                "Name": "gateway_5f570f789f24",
                "EndpointID": "e712a6016c8667864b0c7f3b8acbb9c6e019776a2f6d3127de25ff25dff1e54e",
                "MacAddress": "02:42:ac:12:00:03",
                "IPv4Address": "172.18.0.3/16",
                "IPv6Address": ""
            },
            "75b42231689d665994bb13682450d972649adade31c20109c7782f963732741f": {
                "Name": "gateway_75b42231689d",
                "EndpointID": "1dca374cb6082d2ba904fbc14d303e47d667b85e43716086324056c411408fe3",
                "MacAddress": "02:42:ac:12:00:05",
                "IPv4Address": "172.18.0.5/16",
                "IPv6Address": ""
            },
            "cdf29649e9d3890c6194c3d0bf1aa33ba3984388a411f515ba554fa6ce60ac66": {
                "Name": "gateway_cdf29649e9d3",
                "EndpointID": "6d329d02f9c5651828a8739e046633277d4961102ff781a5bbfb482876916171",
                "MacAddress": "02:42:ac:12:00:02",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": ""
            },
            "d654d693ef54251e37d33282c6b2eb11872225008f813c50d5b75978b25a0810": {
                "Name": "gateway_d654d693ef54",
                "EndpointID": "8aeb928b6c33d1825e454b5f577193d083adc72fef7de5f717e63ddd0c793a29",
                "MacAddress": "02:42:ac:12:00:06",
                "IPv4Address": "172.18.0.6/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.enable_icc": "false",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.name": "docker_gwbridge"
        },
        "Labels": {}
    }
]

As you can see, the overlay network js.local is inaccessible. My guess is that this is because the docker_gwbridge network has com.docker.network.bridge.enable_icc set to false.

Please advise.





Re: Question on create_image.sh "A Multi-node Docker Cluster platform to quickly spin up HDP"

Cloudera Employee

As far as I can tell, "com.docker.network.bridge.enable_icc": "false" is the default for 'docker_gwbridge' when overlay networks are used.

In this cluster setup, the overlay network is made reachable by routing traffic to the overlay-gatewaynode's IP on the bridge network. Please try the following steps:

# source /etc/docker-hdp-lab.conf
# route del -net $OVERLAY_NETWORK
# route add -net $OVERLAY_NETWORK gw 172.18.0.2
# ping 10.0.5.2
Ideally, the following command from the 'docker-hdp-lab' service startup should already have added this route:
route add -net $OVERLAY_NETWORK gw $(docker -H $SWARM_MANAGER:4000 exec overlay-gatewaynode hostname -i | awk '{print $2}')

- If the ping works at this point, add a route entry on external systems in your network that uses the Docker host's IP as the gateway to reach the HDP cluster nodes directly: route add -net 10.0.5.0 netmask 255.255.255.0 gw <Docker_HOST IP>

Also, it looks like you have a single Docker host in this environment. If so, please check that all the roles defined in /etc/docker-hdp-lab.conf, such as SWARM_MANAGER and LOCAL_REPO_NODE, point to the same hostname.
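
The service's route command relies on a command substitution to supply the gateway. If that substitution comes back empty, route is called without a gateway at all. A defensive sketch, assuming the net-tools route command used above; `is_ipv4` and `add_overlay_route` are hypothetical helpers, not part of the docker-hdp-lab service script:

```shell
#!/usr/bin/env bash
# Sketch: add the overlay route only when the gateway looks like IPv4.
is_ipv4() {
  local ip=$1 octet
  case $ip in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, stray chars, bad dots
  esac
  local IFS=.
  set -- $ip                               # split on dots
  [ $# -eq 4 ] || return 1
  for octet in "$@"; do
    [ "$octet" -le 255 ] || return 1
  done
  return 0
}

add_overlay_route() {
  local network=$1 gateway=$2
  if ! is_ipv4 "$gateway"; then
    echo "refusing to add route: gateway '$gateway' is not an IPv4 address" >&2
    return 1
  fi
  # Drop any stale entry first, then add the route via the gateway node.
  route del -net "$network" 2>/dev/null || true
  route add -net "$network" gw "$gateway"
}

# Example, using the values from the steps above:
# source /etc/docker-hdp-lab.conf
# add_overlay_route "$OVERLAY_NETWORK" 172.18.0.2
```

The validation step turns a silent "route add ... gw" with a missing argument into an explicit error that names the bad value.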

Re: Question on create_image.sh "A Multi-node Docker Cluster platform to quickly spin up HDP"

New Contributor

I have 4 Docker engines on separate VirtualBox virtual machines: js5mgr, js5repo, js5wrk1, and js5wrk2. The listing I provided was only from js5mgr.

I am away from my computer right now and will try this once I am back, which may take about 2 hours.

Re: Question on create_image.sh "A Multi-node Docker Cluster platform to quickly spin up HDP"

New Contributor

@rmaruthiyodan (attachment: question3attachment.txt)

On js5mgr, after "route add -net $OVERLAY_NETWORK gw 172.18.0.2", I am able to ping. It appears the command in the docker-hdp-lab service was not able to add the route for the overlay network:

[root@js5mgr ~]# service docker-hdp-lab status
Redirecting to /bin/systemctl status  docker-hdp-lab.service
● docker-hdp-lab.service - Docker HDP Lab Cluster
   Loaded: loaded (/etc/systemd/system/docker-hdp-lab.service; enabled; vendor preset: disabled)
   Active: active (exited) since Tue 2017-02-07 20:40:10 EST; 39min ago
  Process: 3208 ExecStart=/opt/docker_cluster/docker-hdp-lab_service.sh start (code=exited, status=0/SUCCESS)
 Main PID: 3208 (code=exited, status=0/SUCCESS)
   Memory: 0B
   CGroup: /system.slice/docker-hdp-lab.service
Feb 07 20:40:10 js5mgr.js.local docker-hdp-lab_service.sh[3208]: Error response from daemon: No such container overlay-gatewaynode
Feb 07 20:40:10 js5mgr.js.local docker-hdp-lab_service.sh[3208]: route add -net 10.0.5.0/24 gw
Feb 07 20:40:10 js5mgr.js.local docker-hdp-lab_service.sh[3208]: Error response from daemon: No such container overlay-gatewaynode
Feb 07 20:40:10 js5mgr.js.local docker-hdp-lab_service.sh[3208]: Usage: inet_route [-vF] del {-host|-net} Target[/prefix] [gw Gw] [metric M] [[dev] If]
Feb 07 20:40:10 js5mgr.js.local docker-hdp-lab_service.sh[3208]: inet_route [-vF] add {-host|-net} Target[/prefix] [gw Gw] [metric M]
Feb 07 20:40:10 js5mgr.js.local docker-hdp-lab_service.sh[3208]: [netmask N] [mss Mss] [window W] [irtt I]
Feb 07 20:40:10 js5mgr.js.local docker-hdp-lab_service.sh[3208]: [mod] [dyn] [reinstate] [[dev] If]
Feb 07 20:40:10 js5mgr.js.local docker-hdp-lab_service.sh[3208]: inet_route [-vF] add {-host|-net} Target[/prefix] [metric M] reject
Feb 07 20:40:10 js5mgr.js.local docker-hdp-lab_service.sh[3208]: inet_route [-FC] flush NOT supported
Feb 07 20:40:10 js5mgr.js.local systemd[1]: Started Docker HDP Lab Cluster.
[root@js5mgr ~]#
It appears that "docker -H js5mgr.js.local:4000" does not work, as shown below, which would explain the failure in docker-hdp-lab. But why is it not working?

[root@js5mgr ~]# docker -H $SWARM_MANAGER:4000 exec overlay-gatewaynode hostname -i
Error response from daemon: No such container overlay-gatewaynode
[root@js5mgr ~]# echo $SWARM_MANAGER
js5mgr.js.local
[root@js5mgr ~]# docker -H js5mgr.js.local:4000 exec overlay-gatewaynode hostname -i
Error response from daemon: No such container overlay-gatewaynode
[root@js5mgr ~]# docker -H js5mgr:4000 exec overlay-gatewaynode hostname -i
Error response from daemon: No such container overlay-gatewaynode
[root@js5mgr ~]# docker -H localhost:4000 exec overlay-gatewaynode hostname -i
Error response from daemon: No such container overlay-gatewaynode
[root@js5mgr ~]# docker  exec overlay-gatewaynode hostname -i
10.0.5.2


Here are the files docker-hdp-lab.service and /etc/docker-hdp-lab.conf:
cat /etc/systemd/system/docker-hdp-lab.service
[Unit]
Description=Docker HDP Lab Cluster
After=docker.service

[Service]
Type=notify
ExecStart=/opt/docker_cluster/docker-hdp-lab_service.sh start
ExecStop=/opt/docker_cluster/docker-hdp-lab_service.sh stop
RemainAfterExit=True


[Install]
WantedBy=multi-user.target
[root@js5mgr ~]# cat /etc/docker-hdp-lab.conf
# Designate the role of different Docker Host machines using the properties below:
SWARM_MANAGER=js5mgr.js.local
DEFAULT_DOMAIN_NAME="js.local"
LOCAL_REPO_NODE="js5repo.js.local"
OVERLAY_NETWORK="10.0.5.0/24"
LOCAL_IP=$(hostname -i | awk '{print $1}')
# Replace the value of LOCAL_IP with this host's IP that will be used to communiate with the other nodes in the Docker Swarm Cluster


NUM_OF_DOCKER_HOSTS=4
DOCKER_HOST1="js5wrk1.js.local"
DOCKER_HOST2="js5wrk2.js.local"
DOCKER_HOST3="js5mgr.js.local"
DOCKER_HOST4="js5repo.js.local"
# For multi-node Docker swarm cluster, set NUM_OF_DOCKER_HOSTS=[0-9]
# & Add DOCKER_HOST[2-9] variables for every Docker Host node. Such as DOCKER_HOST2="altair"


CLEAN_UP_EXCEPTION_FILE="/opt/maggie/daily_exception_list_for_stop"
# The exception file for daily cleanup script "__daily_stop_cluster.sh" . And the file gets updated using "keep_it_running.sh" script



My questions are:
1) Do I need to set up the gateway route on all the virtual machines?
2) Why are we using 172.18.0.2 rather than any other IP defined in that subnet?
3) I stopped the docker-hdp-lab service and then started it again; most of the containers seem to be exiting with code 137. I remember that last time I had to start all the containers manually. Please see the attached file for details.
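
On question 2: in the "docker inspect docker_gwbridge" output earlier in the thread, 172.18.0.2/16 is the address listed for container cdf29649e9d3..., which is overlay-gatewaynode. It therefore appears to be that container's address on the bridge rather than an arbitrary pick. A sketch of looking the address up instead of hard-coding it follows. The `gwbridge_ip_for` helper and its "ID ADDRESS" input format are assumptions, and the docker network inspect template in the comment is one assumed way of producing that input.

```shell
#!/usr/bin/env bash
# Sketch: find a container's docker_gwbridge address from lines of the
# form "<container id> <ip>/<prefix length>" read on stdin.
gwbridge_ip_for() {
  local want=$1 id addr
  while read -r id addr; do
    case $id in
      "$want"*)                        # accept a short ID prefix as well
        printf '%s\n' "${addr%/*}"     # strip the "/16" suffix
        return 0
        ;;
    esac
  done
  return 1
}

# Assumed way to feed it on the Docker host:
# docker network inspect docker_gwbridge \
#   --format '{{range $id, $c := .Containers}}{{$id}} {{$c.IPv4Address}}{{"\n"}}{{end}}' \
#   | gwbridge_ip_for "$(docker inspect --format '{{.Id}}' overlay-gatewaynode)"
```

Looking the address up this way should keep the route correct if the gateway container is recreated and receives a different bridge address.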

Re: Question on create_image.sh "A Multi-node Docker Cluster platform to quickly spin up HDP"

New Contributor

I did the following for manual start:

[root@js5mgr ~]# docker start $(docker ps -a | grep Exited | awk '{print $1}')
75b42231689d
d654d693ef54
2ec49ae9dea1
40635efeeee2
5f570f789f24
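
Grepping the "docker ps -a" table for "Exited" can also match other columns. A slightly safer sketch uses the status filter of docker ps instead; `start_exited_containers` is an illustrative helper name:

```shell
#!/usr/bin/env bash
# Sketch: start every container whose status is "exited".
start_exited_containers() {
  local ids
  # -q prints only container IDs; the filter selects exited containers.
  ids=$(docker ps -a -q --filter status=exited) || return 1
  if [ -z "$ids" ]; then
    echo "no exited containers"
    return 0
  fi
  # Word splitting is intended: docker ps prints one ID per line.
  docker start $ids
}

# Example:
# start_exited_containers
```

As for the exit code mentioned earlier in the thread, 137 is 128 + 9, meaning the container's main process was ended by SIGKILL. The State section of "docker inspect" (the ExitCode and OOMKilled fields) may help to tell an out-of-memory kill from a forced stop.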
