Data science workbench docker has no internet access

Contributor

I installed Cloudera Data Science Workbench on a gateway node, and it appears to be up and running without any errors. However, the Docker containers do not have access to the internet, so I can't install any packages in them. Building an image fails at the apt-get update step with DNS resolution errors. The exact error message is:

Step 2/12 : RUN apt-get update -y
 ---> Running in 435f1addc906
Err:1 http://security.debian.org testing/updates InRelease
  Temporary failure resolving 'security.debian.org'
Err:2 http://deb.debian.org/debian testing InRelease
  Temporary failure resolving 'deb.debian.org'
Err:3 http://http.debian.net/debian sid InRelease
  Temporary failure resolving 'http.debian.net'
Err:4 http://deb.debian.org/debian testing-updates InRelease
  Temporary failure resolving 'deb.debian.org'
Reading package lists...
W: Failed to fetch http://deb.debian.org/debian/dists/testing/InRelease  Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://deb.debian.org/debian/dists/testing-updates/InRelease  Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://security.debian.org/dists/testing/updates/InRelease  Temporary failure resolving 'security.debian.org'
W: Failed to fetch http://http.debian.net/debian/dists/sid/InRelease  Temporary failure resolving 'http.debian.net'
W: Some index files failed to download. They have been ignored, or old ones used instead.

The output of cdsw status:

Cloudera Data Science Workbench Status

Service Status
docker: active
kubelet: active
nfs: active
Checking kernel parameters...

Node Status
NAME                                        STATUS    AGE       STATEFUL
ip-xx.eu-west-1.compute.internal   Ready     15d       true

System Pod status
NAME                                                                READY     STATUS    RESTARTS   AGE
dummy-2088944543-pfazy                                              1/1       Running   0          15d
etcd-ip-xx.eu-west-1.compute.internal                      1/1       Running   0          15d
kube-apiserver-ip-xx.eu-west-1.compute.internal            1/1       Running   0          15d
kube-controller-manager-ip-xx.eu-west-1.compute.internal   1/1       Running   0          15d
kube-discovery-1150918428-50nmx                                     1/1       Running   0          15d
kube-dns-3873593988-gg6s2                                           3/3       Running   0          15d
kube-proxy-0j15p                                                    1/1       Running   0          15d
kube-scheduler-ip-xx.eu-west-1.compute.internal            1/1       Running   0          15d
node-problem-detector-v0.1-ktr13                                    1/1       Running   0          15d
weave-net-r8j2g                                                     2/2       Running   0          15d

Cloudera Data Science Workbench Pod Status
NAME                                  READY     STATUS      RESTARTS   AGE       ROLE
cron-3971587342-ddoca                 1/1       Running     0          15d       cron
db-4066525870-qchwg                   1/1       Running     0          15d       db
db-migrate-abec968-oxxek              0/1       Completed   0          15d       db-migrate
dhqrwn5eobowq3ea                      0/2       Pending     0          4d        console
engine-deps-ufifx                     1/1       Running     0          15d       engine-deps
ingress-controller-2976678207-g88f5   1/1       Running     0          15d       ingress-controller
livelog-2494298876-chy37              1/1       Running     0          15d       livelog
reconciler-577027981-slrwk            1/1       Running     0          15d       reconciler
spark-port-forwarder-7ixp4            1/1       Running     0          15d       spark-port-forwarder
web-1304125449-2of76                  1/1       Running     2          15d       web
web-1304125449-q3rbd                  1/1       Running     0          15d       web
web-1304125449-vydxd                  1/1       Running     1          15d       web

What do I need to change to have internet access inside the Docker containers?

Thanks!

3 REPLIES

Expert Contributor

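The Docker daemon within Cloudera Data Science Workbench runs with the --iptables=false option, so Docker does not create the NAT rules that containers on the default bridge network rely on to reach external hosts. This means that you need to build with docker build --net=host if you need internet connectivity.
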
Note that Cloudera does not support or recommend using the internal Docker for builds or third-party use cases. Doing so will break how Cloudera Data Science Workbench allocates CPU and memory resources to jobs and sessions.
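
A minimal sketch of the host-network build mentioned above, written as a small Python helper so the command can be inspected before it is run. It assumes Docker 1.13 or later, where docker build accepts the setting as --network (the --net spelling belongs to docker run). Whether the Docker bundled with a particular CDSW release is recent enough is an assumption to confirm with docker build --help. The image tag my-engine:1 and the function name are hypothetical.

```python
def docker_build_argv(tag, context=".", network="host"):
    """Assemble a `docker build` command that uses the given network mode.

    Assumption: the local Docker CLI supports `docker build --network`
    (Docker 1.13+). `tag` and `context` are illustrative placeholders.
    """
    return ["docker", "build", f"--network={network}", "--tag", tag, context]


if __name__ == "__main__":
    # Print the command instead of running it, so the flags can be reviewed.
    print(" ".join(docker_build_argv("my-engine:1")))
```

Passing the returned list to subprocess.run(argv, check=True) would run the build. Printing it first is a cheap way to check the flags.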

Contributor

Thanks for the answer!

Two questions. First, it seems that docker build does not have the --net option, only docker run does. What can I do to include this setting in the build? Second, what is the supported way of adding or changing a Docker image for CDSW? Should I pull it from a repo?

Expert Contributor (accepted solution)

For real installations, you should pull the image from a repository. This ensures that all the nodes in your CDSW cluster have access to the image, not just the node where you built it. Moreover, you should not assume that your Docker image store is persistent across upgrades, or in long-running clusters where we may evict less-used images to free space. By pushing your custom images to a repository, you ensure that they are not lost to image eviction policies or other administration tasks.
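
As a rough illustration of the repository workflow described above, the helper below assembles the docker tag and docker push commands for an image built on a separate build machine. The registry host registry.example.com, the repository name team/my-engine, and the function name are hypothetical placeholders, not values from CDSW.

```python
def publish_commands(local_tag, registry, repository, version):
    """Return the docker commands that publish a locally built image.

    All names are illustrative: substitute the registry and repository
    that your cluster nodes are actually able to pull from.
    """
    remote = f"{registry}/{repository}:{version}"
    return [
        ["docker", "tag", local_tag, remote],   # give the image its registry name
        ["docker", "push", remote],             # upload it to the registry
    ]


if __name__ == "__main__":
    for argv in publish_commands("my-engine:1", "registry.example.com", "team/my-engine", "1"):
        print(" ".join(argv))
```

Each command list can be run with subprocess.run(argv, check=True). Once the image is in the registry, any node can pull it again if its local copy has been evicted.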