Explorer
Posts: 10
Registered: 03-29-2017

cdsw ingress-controller CrashLoopBackOff

Hi there !

 

I'm running into issues while installing Cloudera Data Science Workbench.

Cloudera Manager version: 5.12.0
CDH version: 5.9.1

Configurations in /etc/cdsw/config/cdsw.conf:

DOMAIN="cdsw.my.domain.abc.com"
MASTER_IP="private.ip.address"
DOCKER_BLOCK_DEVICES="/dev/xvdf"
APPLICATION_BLOCK_DEVICE=""
JAVA_HOME="/usr/java/jdk1.8.0_91"
TLS_ENABLE="true"
TLS_CERT="/opt/cdsw/cdsw.cer"
TLS_KEY="/opt/cdsw/cdsw.key"
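
With TLS_ENABLE set, a common cause of the ingress controller crash-looping is a certificate/private-key pair that doesn't match. A quick sanity check (a sketch, using the cert/key paths from the cdsw.conf above):

```shell
# Compare the RSA modulus of the certificate and the private key;
# the two digests must be identical for the pair to be valid.
openssl x509 -noout -modulus -in /opt/cdsw/cdsw.cer | openssl md5
openssl rsa  -noout -modulus -in /opt/cdsw/cdsw.key | openssl md5
```

If the two MD5 digests differ, the cert and key do not belong together and the ingress controller will fail to start.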

 

This is the Service status:

 

cdsw status

 

Cloudera Data Science Workbench Status

Service Status
docker: active
kubelet: active
nfs: active
Checking kernel parameters...

Node Status
NAME STATUS AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION STATEFUL
ip-10-28-0-47.ec2.internal Ready 8m v1.6.2 <none> CentOS Linux 7 (Core) 3.10.0-327.10.1.el7.x86_64 true

GPUs present on nodes:
ip-10-28-0-47.ec2.internal ==>

System Pod status
NAME READY STATUS RESTARTS AGE IP NODE
etcd-ip-10-28-0-47.ec2.internal 1/1 Running 0 8m 10.28.0.47 ip-10-28-0-47.ec2.internal
kube-apiserver-ip-10-28-0-47.ec2.internal 1/1 Running 0 8m 10.28.0.47 ip-10-28-0-47.ec2.internal
kube-controller-manager-ip-10-28-0-47.ec2.internal 1/1 Running 0 8m 10.28.0.47 ip-10-28-0-47.ec2.internal
kube-dns-3913472980-s65z9 3/3 Running 0 7m 100.66.0.2 ip-10-28-0-47.ec2.internal
kube-proxy-tknz5 1/1 Running 0 7m 10.28.0.47 ip-10-28-0-47.ec2.internal
kube-scheduler-ip-10-28-0-47.ec2.internal 1/1 Running 0 8m 10.28.0.47 ip-10-28-0-47.ec2.internal
node-problem-detector-v0.1-8lnhz 1/1 Running 0 7m 10.28.0.47 ip-10-28-0-47.ec2.internal
weave-net-1zn6z 2/2 Running 0 7m 10.28.0.47 ip-10-28-0-47.ec2.internal

Cloudera Data Science Workbench Pod Status
NAME READY STATUS RESTARTS AGE IP NODE ROLE
cron-891682548-28klz 1/1 Running 0 7m 100.66.0.7 ip-10-28-0-47.ec2.internal cron
db-3917338260-gdbt3 1/1 Running 0 7m 100.66.0.8 ip-10-28-0-47.ec2.internal db
db-migrate-da59eb6-2p24s 0/1 Completed 0 7m 100.66.0.4 ip-10-28-0-47.ec2.internal db-migrate
engine-deps-kmjsx 1/1 Running 0 7m 100.66.0.3 ip-10-28-0-47.ec2.internal engine-deps
ingress-controller-144817096-5llhp 0/1 CrashLoopBackOff 6 7m 10.28.0.47 ip-10-28-0-47.ec2.internal ingress-controller
livelog-4029644909-nx57s 1/1 Running 0 7m 100.66.0.6 ip-10-28-0-47.ec2.internal livelog
reconciler-3820891121-dtd17 1/1 Running 0 7m 100.66.0.5 ip-10-28-0-47.ec2.internal reconciler
spark-port-forwarder-n4xjg 1/1 Running 0 7m 100.66.0.9 ip-10-28-0-47.ec2.internal spark-port-forwarder
web-1750516886-h4n4h 1/1 Running 0 7m 100.66.0.11 ip-10-28-0-47.ec2.internal web
web-1750516886-hl1n4 1/1 Running 0 7m 100.66.0.4 ip-10-28-0-47.ec2.internal web
web-1750516886-shpx1 1/1 Running 0 7m 100.66.0.10 ip-10-28-0-47.ec2.internal web


WARNING: Some pods in the CDSW application are not yet up.: 1

ERROR:: Cloudera Data Science Workbench is not ready yet: some application pods are not ready: 1

I'm not sure why I'm hitting this even after following the Cloudera documentation for the setup.

Can someone please help me fix this?


Re: cdsw ingress-controller CrashLoopBackOff

This is the error I can see in the cdsw logs:

 

WARNING: Some pods in the CDSW application are not yet up.: 1

ERROR:: Cloudera Data Science Workbench is not ready yet: some application pods are not
ready: 1
---- EXIT CODE 1
---- RUN cdsw validate
Checking services...
Checking if docker is active and enabled
Checking if docker is responsive
Checking if kubelet is active and enabled
Testing networking setup...
Check if kubelet iptables rules exist
Check that firewalld is disabled
Check configuration file...
Checking master node filesystem configuration...
Checking kubernetes
Checking system pods
Checking application pods exist
Checking application pods are running

ERROR:: Application pod ingress-controller-144817096-5llhp is not running, its state is
CrashLoopBackOff: 1
---- EXIT CODE 1
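
`cdsw validate` only reports the pod state; the container's own output is usually more informative. Assuming `kubectl` is available on the master node, something like the following should show why the pod keeps restarting (the pod name is hypothetical here, taken from the `cdsw status` output above):

```shell
# Output from the last crashed run of the container; with TLS problems
# this typically shows a certificate parse/load error.
kubectl logs ingress-controller-144817096-5llhp --previous

# Scheduling, restart, and probe-failure events for the pod.
kubectl describe pod ingress-controller-144817096-5llhp
```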


Re: cdsw ingress-controller CrashLoopBackOff

There was an issue with the TLS certificate.

I replaced the certificate and ran `cdsw reset` followed by `cdsw init`, which resolved the issue.
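
For anyone hitting the same symptom: besides the cert/key matching, it's worth confirming the certificate actually covers the DOMAIN from cdsw.conf and hasn't expired. A quick check (a sketch, using the cert path from the config above):

```shell
# Who the cert was issued for and its validity window; the CN (or a SAN
# entry, ideally including the wildcard *.DOMAIN) should cover the DOMAIN
# set in cdsw.conf, and the window must include today.
openssl x509 -noout -subject -dates -in /opt/cdsw/cdsw.cer

# List any Subject Alternative Name entries.
openssl x509 -noout -text -in /opt/cdsw/cdsw.cer | grep -A1 'Subject Alternative Name'
```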
