Explorer
Posts: 33
Registered: ‎05-21-2017

Node stateful is none

I have installed Workbench on a 5-node cluster. Everything looks good, but when I check "cdsw status" it shows STATEFUL as <none> for four of the nodes:

Node Status
NAME        STATUS    AGE       STATEFUL
hostname1   Ready     1h        true
hostname2   Ready     1h        <none>
hostname3   Ready     1h        <none>
hostname4   Ready     1h        <none>
hostname5   Ready     1h        <none>

And when I launch the workbench from cdsw.company.com, it shows "ContainerCreating: Creating engine container." forever, and the input field blinks red.

Cloudera Employee
Posts: 29
Registered: ‎04-28-2017

Re: Node stateful is none

The <none> indicator is not an issue -- it simply indicates that those nodes are worker nodes and don't have stateful information stored on them.

 

Hanging engines on "ContainerCreating" typically means you have not run "cdsw enable <worker-ip>" on the master node for all your worker nodes.  This whitelists the IP of your worker nodes for NFS mounts.  If you have not done this, containers can hang waiting for the project mounts to become available when scheduled onto a worker node.
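
As a rough sketch of the step above (not an official script), a loop like the following could issue "cdsw enable" for each worker from the master node. The IP addresses are placeholders, and the DRY_RUN switch and the enable_workers helper are illustrative names, not part of CDSW. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Placeholder worker IPs (assumption): replace with the real addresses.
WORKER_IPS="10.0.0.11 10.0.0.12"

# Print the "cdsw enable" command for each IP when DRY_RUN=1 (the default),
# or actually run it when DRY_RUN=0. Intended to be run on the master node.
enable_workers() {
  for ip in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "cdsw enable ${ip}"
    else
      cdsw enable "${ip}"
    fi
  done
}

# Word splitting of $WORKER_IPS is intentional here.
enable_workers $WORKER_IPS
```

Once the printed commands look right, running the script with DRY_RUN=0 on the master would execute them.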

 

Please let me know if running "cdsw enable" for each worker IP resolves this issue.

 

Thanks,

Tristan

NES
Explorer
Posts: 12
Registered: ‎07-03-2017

Re: Node stateful is none

Hi,

I am also experiencing the same error.

My workbench has 2 workers and a master. The workers are enabled, and with "cdsw enable" I can see that they are running.

Regards

Nes

Explorer
Posts: 33
Registered: ‎05-21-2017

Re: Node stateful is none

I have tried the following:

Test 1: cdsw enable "worker_node_ip"
Result: Same issue

Test 2: Removed the nodes from the cluster and added them again
Result: Same issue

Test 3: Reset the master and workers, then ran "cdsw init" and "cdsw enable worker_ip" on the master and "cdsw join" on the workers
Result: Same issue

I am still getting "ContainerCreating: Creating engine container."

Admin --> Site Administration --> Overview:

[Screenshot: overview.JPG]

Cloudera Employee
Posts: 29
Registered: ‎04-28-2017

Re: Node stateful is none

Could you please give the output for:

 

kubectl get events

kubectl logs <stuck-pod-id> engine
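
If it is not obvious which pod is the stuck one, a small filter over "kubectl get pods" can help. This is only a sketch: it assumes the default "kubectl get pods" column order (NAME, READY, STATUS, ...), not_running_pods is a made-up helper name, and the sample rows are invented input for illustration, not output from a real cluster.

```shell
#!/bin/sh
# Read "kubectl get pods" output on stdin and print NAME and STATUS for
# every pod whose STATUS is neither Running nor Completed.
# Assumes STATUS is the third column (the default layout).
not_running_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Demo with invented sample rows; on a live cluster use instead:
#   kubectl get pods | not_running_pods
printf '%s\n' \
  'NAME              READY  STATUS             RESTARTS  AGE' \
  'web-example-1     1/1    Running            0         17h' \
  'session-example   0/3    ContainerCreating  0         2m' |
  not_running_pods
```

Any pod name it prints can then be used in place of <stuck-pod-id> in the commands above, or passed to "kubectl describe pod <name>".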

 

Tristan

Explorer
Posts: 33
Registered: ‎05-21-2017

Re: Node stateful is none

Hi Tristan,

"kubectl get events" didn't give any output.

Below is the "cdsw status" output:

[root@hostname ~]# cdsw status
Cloudera Data Science Workbench Status

Service Status
docker: active
kubelet: active
nfs: active
Checking kernel parameters...

Node Status
NAME      STATUS    AGE       STATEFUL
master    Ready     17h       true
worker1   Ready     17h       <none>
worker2   Ready     17h       <none>
worker3   Ready     17h       <none>
worker3   Ready     17h       <none>

System Pod status
NAME                                                     READY     STATUS    RESTARTS   AGE       IP            NODE
dummy-2088944543-c4pbg                                   1/1       Running   0          17h       10.x.x.x   master
etcd-master                                              1/1       Running   0          17h       10.x.x.x   master
kube-apiserver-master                                    1/1       Running   0          17h       10.x.x.x   master
kube-controller-manager-master                           1/1       Running   0          17h       10.x.x.x   master
kube-discovery-1150918428-se35m                          1/1       Running   0          17h       10.x.x.x   master
kube-dns-3873593988-olmcy                                3/3       Running   0          17h       100.66.0.2    master
kube-proxy-cr019                                         1/1       Running   0          17h       10.x.x.x   master
kube-proxy-o316l                                         1/1       Running   0          17h       10.x.x.x   worker3
kube-proxy-txbph                                         1/1       Running   0          17h       10.x.x.x   worker2
kube-proxy-u0riv                                         1/1       Running   0          17h       10.x.x.x   worker3
kube-proxy-xf6ta                                         1/1       Running   0          17h       10.x.x.x   worker1
kube-scheduler-master                                    1/1       Running   0          17h       10.x.x.x   master
node-problem-detector-v0.1-7zp8i                         1/1       Running   0          17h       10.x.x.x   worker1
node-problem-detector-v0.1-be2cf                         1/1       Running   0          17h       10.x.x.x   worker3
node-problem-detector-v0.1-ej7yx                         1/1       Running   0          17h       10.x.x.x   worker2
node-problem-detector-v0.1-maik6                         1/1       Running   0          17h       10.x.x.x   master
node-problem-detector-v0.1-xf9o0                         1/1       Running   0          17h       10.x.x.x   worker3
weave-net-31402                                          2/2       Running   0          17h       10.x.x.x   worker1
weave-net-71t9s                                          2/2       Running   0          17h       10.x.x.x   worker3
weave-net-8p26z                                          2/2       Running   0          17h       10.x.x.x   worker3
weave-net-m4e8x                                          2/2       Running   0          17h       10.x.x.x   worker2
weave-net-wfd35                                          2/2       Running   0          17h       10.x.x.x   master

Cloudera Data Science Workbench Pod Status
NAME                                  READY     STATUS      RESTARTS   AGE       IP             NODE      ROLE
cron-2934152315-ymbxu                 1/1       Running     0          17h       100.66.0.8     master    cron
db-39862959-s2ic8                     1/1       Running     0          17h       100.66.0.4     master    db
db-migrate-052787a-170ff              0/1       Completed   0          17h       100.66.0.5     master    db-migrate
engine-deps-3uqcr                     1/1       Running     0          17h       100.66.0.3     master    engine-deps
engine-deps-6npbb                     1/1       Running     0          17h       100.66.192.1   worker1   engine-deps
engine-deps-m2385                     1/1       Running     0          17h       100.66.64.1    worker2   engine-deps
engine-deps-qgcwy                     1/1       Running     0          17h       100.66.128.1   worker3   engine-deps
engine-deps-zblkz                     1/1       Running     0          17h       100.66.160.1   worker3   engine-deps
ingress-controller-3138093376-nx1wi   1/1       Running     0          17h       10.x.x.x       master    ingress-controller
livelog-1900214889-bqhf7              1/1       Running     0          17h       100.66.0.7     master    livelog
reconciler-459456250-ma02c            1/1       Running     0          17h       100.66.0.6     master    reconciler
spark-port-forwarder-0yxno            1/1       Running     0          17h       10.x.x.x       worker2   spark-port-forwarder
spark-port-forwarder-86dv2            1/1       Running     0          17h       10.x.x.x       worker3   spark-port-forwarder
spark-port-forwarder-l2u4k            1/1       Running     0          17h       10.x.x.x       worker1   spark-port-forwarder
spark-port-forwarder-lpwms            1/1       Running     0          17h       10.x.x.x       master    spark-port-forwarder
spark-port-forwarder-rsx25            1/1       Running     0          17h       10.x.x.x       worker3   spark-port-forwarder
web-3826671331-0n92g                  1/1       Running     0          17h       100.66.0.10    master    web
web-3826671331-my2vs                  1/1       Running     0          17h       100.66.0.9     master    web
web-3826671331-zva8n                  1/1       Running     0          17h       100.66.0.5     master    web

Cloudera Data Science Workbench is ready!

"kubectl logs <stuck-pod-id> engine": no pod is in a stuck state. It gets stuck while launching the container in the web UI. Please see the screenshot below.

[Screenshot: container launch error.JPG]

Thanks

Krishna

Cloudera Employee
Posts: 24
Registered: ‎07-09-2015

Re: Node stateful is none

Hi Krishna,

 

When you start a new session in the workbench you should see a new pod in the list:

gy17uw2d8p5gpuh1                      3/3       Running     0          14m       100.66.0.18     hostname   console 

 

We would like to see the logs for this pod.

 

Regards,

Peter

Explorer
Posts: 33
Registered: ‎05-21-2017

Re: Node stateful is none

Below is the output of "kubectl logs pod_id engine"

 

[root@hostname ~]# kubectl logs ubgujdi5b9b6mmwr engine
2017-07-11 09:06:10.097 9       INFO    Engine                          Waiting one second for Spark config...  data = {"id":"ubgujdi5b9b6mmwr"}
2017-07-11 09:06:11.186 15      INFO    Engine                          Waiting one second for Spark config...  data = {"id":"ubgujdi5b9b6mmwr"}
/var/lib/cdsw/config/startup.sh: line 31: undefined: command not found
2017/07/11 09:06:12 Loading config file at: /var/lib/cdsw/deps/terminal-conf
2017/07/11 09:06:12 Permitting clients to write input to the PTY.
2017/07/11 09:06:12 Server is starting with command: /bin/bash
2017/07/11 09:06:12 URL: http://0.0.0.0:8000/xmfwh64etire9k0l/
2017/07/11 09:06:13 100.66.0.1:51408 301 GET /xmfwh64etire9k0l
2017-07-11 09:06:13.636 7       INFO    Engine                          ubgujdi5b9b6mmwr                Start  Authenticating to livelog  data = {"secondsSinceStartup":0.85}
Livelog Open
2017-07-11 09:06:13.679 7       INFO    Engine                          ubgujdi5b9b6mmwr                Finish Authenticating to livelog: success data = {"secondsSinceStartup":0.898}
2017-07-11 09:06:13.680 7       INFO    Engine                          ubgujdi5b9b6mmwr                Start  Searching for engine module        data = {"secondsSinceStartup":0.9}
2017-07-11 09:06:14.410 7       INFO    Engine                          ubgujdi5b9b6mmwr                Finish Searching for engine module: success       data = {"engineModule_path":"/usr/local/lib/node_modules/python2-engine"}
2017-07-11 09:06:14.410 7       INFO    Engine                          ubgujdi5b9b6mmwr                Start  Creating engine    data = {"secondsSinceStartup":1.63}
PID of parser IPython process is 59
PID of main IPython process is 62
2017-07-11 09:06:14.661 7       INFO    Engine                          ubgujdi5b9b6mmwr                Finish Creating engine    data = {"secondsSinceStartup":1.88}
2017-07-11 09:06:22.492 7       INFO    Engine                          ubgujdi5b9b6mmwr                Start  Registering running status data = {"useHttps":false,"host":"100.77.0.130","path":"/api/v1/projects/Krishna/test1/dashboards/ubgujdi5b9b6mmwr/register-status","senseDomain":"cdsw.adobe.com"}
2017-07-11 09:06:22.505 7       INFO    Engine                          ubgujdi5b9b6mmwr                Finish Registering running status: success
2017-07-11 09:06:22.506 7       INFO    Engine                          ubgujdi5b9b6mmwr                Pod is ready    data = {"secondsSinceStartup":9.726,"engineModuleShare":8.096}
Explorer
Posts: 33
Registered: ‎05-21-2017

Re: Node stateful is none

Any update, @tristanzajonc @peter.ableda?

New Contributor
Posts: 1
Registered: ‎07-24-2017

Re: Node stateful is none

 

 

Open your browser's developer tools and go to the console; I guess you will find an error there.

I had the same problem, and the error I found there pointed me to a wildcard DNS problem.
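
For anyone hitting the same thing, here is a rough way to check wildcard DNS from a shell. It is a sketch under a few assumptions: the workbench domain is cdsw.company.com (taken from the first post), a wildcard record for *.cdsw.company.com is expected to resolve to the same address as the domain itself, and "getent" is available (Linux). The helper names resolve_ip and check_wildcard_dns and the "wildcard-test" label are made up for illustration.

```shell
#!/bin/sh
# Resolve a hostname to its first address via the system resolver.
resolve_ip() {
  getent hosts "$1" | awk '{ print $1; exit }'
}

# Compare the address of DOMAIN with that of an arbitrary subdomain.
# Returns 0 only if both resolve to the same non-empty address.
check_wildcard_dns() {
  domain=$1
  base_ip=$(resolve_ip "$domain")
  sub_ip=$(resolve_ip "wildcard-test.$domain")  # arbitrary label
  if [ -n "$base_ip" ] && [ "$base_ip" = "$sub_ip" ]; then
    echo "OK: *.$domain resolves to $base_ip"
  else
    echo "PROBLEM: $domain -> '$base_ip', wildcard-test.$domain -> '$sub_ip'"
    return 1
  fi
}
```

Calling "check_wildcard_dns cdsw.company.com" on the machine where the browser runs should show whether subdomains resolve there. A PROBLEM line would be consistent with the wildcard DNS issue described above, though it does not prove that it is the only cause.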

 

 
