Created on 05-19-2017 03:38 AM - edited 09-16-2022 04:38 AM
I'm trying to install CDSW on RHEL 7.2, but cdsw init does not finish. The final status is:
Cloudera Data Science Workbench Status
Service Status
docker: active
kubelet: active
nfs: active
Checking kernel parameters...
Node Status
NAME STATUS AGE STATEFUL
dn183.pf4h.local Ready 1h <none>
System Pod status
NAME READY STATUS RESTARTS AGE
dummy-2088944543-kw60c 1/1 Running 0 1h
etcd-dn183.pf4h.local 1/1 Running 0 1h
kube-apiserver-dn183.pf4h.local 1/1 Running 0 1h
kube-controller-manager-dn183.pf4h.local 1/1 Running 0 1h
kube-discovery-1150918428-m9kup 1/1 Running 0 1h
kube-dns-654381707-f5yo4 2/3 Running 32 1h
kube-proxy-2zoxp 1/1 Running 0 1h
kube-scheduler-dn183.pf4h.local 1/1 Running 0 1h
weave-net-2s426 2/2 Running 0 1h
Cloudera Data Science Workbench Pod Status
WARNING: Unable to get details for role [cron].
Cloudera Data Science Workbench is not ready yet: the cluster specification is incomplete
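The kube-dns pod sitting at 2/3 READY with 32 restarts looks like the part that never becomes healthy. For anyone hitting the same state, a rough way to dig into it would be something like the following (plain kubectl on the master; the pod name is taken from the status output above, the container name is only a guess since that pod usually runs kubedns/dnsmasq/healthz containers, and if kubectl complains about localhost:8080 you may need to point it at the admin kubeconfig, typically /etc/kubernetes/admin.conf):
# events often show failing health probes or image pull problems
kubectl --namespace=kube-system describe pod kube-dns-654381707-f5yo4
# logs of the previous run of the container that keeps restarting (container name is an assumption)
kubectl --namespace=kube-system logs kube-dns-654381707-f5yo4 -c kubedns --previous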
Created 05-19-2017 03:57 AM
Here is the output of cdsw init:
[root@dn183 ~]# cdsw init
Using user-specified config file: /etc/cdsw/config/cdsw.conf
Prechecking OS Version........[OK]
Prechecking scaling limits for processes........[OK]
Prechecking scaling limits for open files........[OK]
Prechecking that iptables are not configured........[OK]
Prechecking that SELinux is disabled........[OK]
Prechecking configured block devices and mountpoints........[OK]
Prechecking kernel parameters........[OK]
Prechecking that docker block devices are of adequate size........[OK]
Prechecking that application block devices are of adequate size........[OK]
Prechecking size of root volume........[OK]
Prechecking that CDH gateway roles are configured........[OK]
Prechecking that /etc/krb5 file is not a placeholder........
WARNING: The Kerberos configuration file [/etc/krb5.conf] seems to be a placeholder. If your CDH cluster is Kerberized, please copy /etc/krb5.conf to these Cloudera Data Science Workbench nodes.
Press enter to continue.
Prechecking parcel paths........[OK]
Prechecking CDH client configurations........[OK]
Prechecking Java version........[OK]
Prechecking Java distribution........[OK]
Creating docker thinpool if it does not exist
Volume group "docker" not found
Cannot process volume group docker
Unmounting /dev/sdb
umount: /dev/sdb: not mounted
Removing Docker volume groups.
Volume group "docker" not found
Cannot process volume group docker
Volume group "docker" not found
Cannot process volume group docker
Cleaning up docker directories...
Physical volume "/dev/sdb" successfully created
Volume group "docker" successfully created
WARNING: xfs signature detected on /dev/docker/thinpool at offset 0. Wipe it? [y/n]: y
Wiping xfs signature on /dev/docker/thinpool.
Logical volume "thinpool" created.
Logical volume "thinpoolmeta" created.
WARNING: Converting logical volume docker/thinpool and docker/thinpoolmeta to pool's data and metadata volumes.
THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Converted docker/thinpool to thin pool.
Logical volume "thinpool" changed.
Initialize application storage at /var/lib/cdsw
[/dev/sdc] is formatted as [] and not ext4, hence needs to be formatted.
WARNING: Formatting will erase all project files and data! Are you sure? y
Starting rpc-statd...
Enabling rpc-statd...
Starting nfs-idmapd...
Enabling nfs-idmapd...
Starting rpcbind...
Enabling rpcbind...
Starting nfs-server...
Enabling nfs-server...
Proceeding to create storage for application...
mke2fs 1.42.9 (28-Dec-2013)
/dev/sdc is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
73228288 inodes, 292896768 blocks
14644838 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2441084928
8939 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
Enabling node with IP [192.168.239.183]...
Node [192.168.239.183] added to nfs export list successfully.
Starting rpc-statd...
Enabling rpc-statd...
Starting nfs-idmapd...
Enabling nfs-idmapd...
Starting rpcbind...
Enabling rpcbind...
Starting nfs-server...
Enabling nfs-server...
Starting docker...
Enabling docker...
Starting ntpd...
Enabling ntpd...
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /etc/systemd/system/kubelet.service.
Initializing cluster...
Running pre-flight checks
<master/tokens> generated token: "afde63.f3bb72d9a16f9a74"
<master/pki> generated Certificate Authority key and certificate:
Issuer: CN=kubernetes | Subject: CN=kubernetes | CA: true
Not before: 2017-05-19 08:43:05 +0000 UTC
Not After: 2027-05-17 08:43:05 +0000 UTC
Public: /etc/kubernetes/pki/ca-pub.pem
Private: /etc/kubernetes/pki/ca-key.pem
Cert: /etc/kubernetes/pki/ca.pem
<master/pki> generated API Server key and certificate:
Issuer: CN=kubernetes | Subject: CN=kube-apiserver | CA: false
Not before: 2017-05-19 08:43:05 +0000 UTC
Not After: 2018-05-19 08:43:06 +0000 UTC
Alternate Names: [192.168.239.183 100.77.0.1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local]
Public: /etc/kubernetes/pki/apiserver-pub.pem
Private: /etc/kubernetes/pki/apiserver-key.pem
Cert: /etc/kubernetes/pki/apiserver.pem
<master/pki> generated Service Account Signing keys:
Public: /etc/kubernetes/pki/sa-pub.pem
Private: /etc/kubernetes/pki/sa-key.pem
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready
<master/apiclient> all control plane components are healthy after 42.548542 seconds
<master/apiclient> waiting for at least one node to register and become ready
<master/apiclient> first node is ready after 4.002406 seconds
<master/apiclient> attempting a test deployment
<master/apiclient> test deployment succeeded
<master/discovery> created essential addon: kube-discovery, waiting for it to become ready
<master/discovery> kube-discovery is ready after 24.002057 seconds
<master/addons> created essential addon: kube-proxy
<master/addons> created essential addon: kube-dns
Kubernetes master initialised successfully!
You can now join any number of machines by running the following on each node:
kubeadm join --token=afde63.f3bb72d9a16f9a74 192.168.239.183
Added bootstrap token KUBE_TOKEN to /etc/cdsw/config/cdsw.conf
node "dn183.pf4h.local" tainted
daemonset "weave-net" created
Waiting for kube-system cluster to come up. This could take a few minutes...
Some pods in kube-system have not yet started. This may take a few minutes.
Waiting for 10 seconds before checking again...
Some pods in kube-system have not yet started. This may take a few minutes.
Waiting for 10 seconds before checking again...
Some pods in kube-system have not yet started. This may take a few minutes.
...
...
...
Waiting for 10 seconds before checking again...
ERROR:: Unable to bring up kube-system cluster.: 1
ERROR:: Unable to start clusters: 1
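While cdsw init loops on "Some pods in kube-system have not yet started", it can help to watch the node from a second shell to see why they are stuck. A sketch of what one might check (nothing CDSW-specific; the admin.conf path is simply the kubeconfig the init output above says it created):
# which kube-system pods are Pending or crash-looping
kubectl --kubeconfig=/etc/kubernetes/admin.conf get pods --all-namespaces -o wide
# kubelet errors, e.g. failed image pulls or CNI problems
journalctl -u kubelet --no-pager | tail -n 50
# whether the pod containers are being created at all
docker ps -a | head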
Created on 05-23-2017 06:50 AM - edited 05-23-2017 06:51 AM
I found the problem myself. It was a problem with the proxy settings: the proxy was also being used for the internal status requests on the 100.66.0.0/24 network. I added all of those addresses to the NO_PROXY list and now the workbench comes up. A hint about this in the documentation would be nice.
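In case it saves someone else time, here is roughly what that change can look like. The file location and proxy address are only examples (adjust to wherever HTTP_PROXY/HTTPS_PROXY are actually set on your nodes), and not every tool accepts CIDR notation in NO_PROXY, so you may have to list the individual 100.66.0.x addresses instead of the subnet:
# example proxy settings, e.g. in /etc/environment or a profile script
HTTP_PROXY="http://proxy.example.com:3128"
HTTPS_PROXY="http://proxy.example.com:3128"
NO_PROXY="127.0.0.1,localhost,dn183.pf4h.local,192.168.239.183,100.66.0.0/24,100.77.0.0/24"
# 100.66.0.0/24 is the internal network used for the status requests mentioned above;
# 100.77.0.0/24 covers the 100.77.0.x service address that shows up in the apiserver certificate.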
Created 05-23-2017 06:56 AM
Thanks for reporting back with your solution. I'll forward it to the team so they can review it.
Created 07-06-2017 04:17 AM
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
Exporting user ids...
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
The connection to the server 192.168.31.140:6443 was refused - did you specify the right host or port?
Checking system logs...
Producing logs tarball...
Logs saved to: cdsw-logs-cdh3-2017-07-06--18-46-59.tar.gz
Redacting logs...
Producing redacted logs tarball...
Redacted logs saved to: cdsw-logs-cdh3-2017-07-06--18-46-59.redacted.tar.gz
Cleaning up...
[root@cdh3 ~]# docker pull gcr.io/google_containers/pause-amd64:3.0
^C
[root@cdh3 ~]# kubectl cluster-info dump
The connection to the server localhost:8080 was refused - did you specify the right host or port?
[root@cdh3 ~]# ping gcr.io
PING gcr.io (64.233.189.82) 56(84) bytes of data.
^C
--- gcr.io ping statistics ---
12 packets transmitted, 0 received, 100% packet loss, time 11000ms
[root@cdh3 ~]# ping gcr.io
PING gcr.io (64.233.189.82) 56(84) bytes of data.
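gcr.io is generally not reachable from mainland China without a proxy, and cdsw init does pull container images over the network (the pause-amd64 image above is one of them), which is one common reason kube-apiserver never comes up and you see the connection-refused messages on port 6443. If a proxy is available, one way to let Docker use it is the standard systemd drop-in; the path below is the conventional location and the proxy address is just a placeholder:
# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:3128"
Environment="HTTPS_PROXY=http://proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1,192.168.31.140"

# then reload, restart Docker, and retry the pull
systemctl daemon-reload
systemctl restart docker
docker pull gcr.io/google_containers/pause-amd64:3.0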
Created 07-06-2017 04:24 AM
This is because I'm in China. I would like to ask: does cdsw init need to download anything over the network during initialization?
Created 07-06-2017 04:26 AM
Is there an offline setup mode for Data Science Workbench? Please contact me if possible. Thank you.
Created 07-16-2018 01:00 PM
Hi,
Can anyone guide me on setting up FTP between CDSW and a Unix server to upload and download files using R?