Created on 09-09-2019 01:21 PM - last edited on 09-09-2019 02:13 PM by ask_bill_brooks
I installed the CDSW trial version to evaluate it before purchasing the product, but I am stuck with errors.
The host has a 500 GB block device, and I am using the default /var/lib/ location for the application.
The CDSW disks are provisioned via ESX/VMware.
CDSW starts fine in CM but fails to launch containers. Here is the status.
The failed events for web-5dddb59597-pclf2 are below (host name redacted for security reasons).
I am not sure why tcp-ingress-controller-7d65df6d8f-qqxzn is starting, since I am not using TLS for CDSW.
Any help is highly appreciated.
NAME READY STATUS RESTARTS AGE
cdsw-compute-pod-evaluator-566f7664df-rqm2j 1/1 Running 0 9m38s
cron-6dd99c7b78-kh4rw 1/1 Running 0 9m40s
db-86bbb69b54-nxz4z 1/1 Running 0 9m40s
db-migrate-46715e4-nf5dh 0/1 Completed 0 9m40s
ds-cdh-client-578fc69b9-hcqv7 1/1 Running 0 9m38s
ds-operator-66ff9c4b78-r6bk7 1/1 Running 0 9m38s
ds-reconciler-64bccdd574-vhhpz 0/1 CrashLoopBackOff 4 9m38s
ds-vfs-648f9bddf-7n6hg 1/1 Running 0 9m38s
image-puller-xwbcs 1/1 Running 0 9m39s
ingress-controller-7c8bb7898b-2k4zk 1/1 Running 0 9m40s
livelog-65f9846677-5j8js 1/1 Running 0 9m40s
livelog-publisher-8c89k 1/1 Running 2 9m39s
s2i-builder-b775b49c4-62t56 1/1 Running 0 9m37s
s2i-builder-b775b49c4-vzmwb 1/1 Running 0 9m37s
s2i-builder-b775b49c4-w7hck 1/1 Running 0 9m37s
s2i-client-5658cd7456-8shrb 1/1 Running 0 9m37s
s2i-git-server-6d7d8ccdd8-k4srm 1/1 Running 0 9m40s
s2i-queue-6c699c5f9b-csxns 1/1 Running 0 9m40s
s2i-registry-5db6db6859-rjlkz 1/1 Running 0 9m40s
s2i-registry-auth-76cb9bf597-gmbxp 1/1 Running 0 9m40s
s2i-server-58db56cf6-7t5fb 1/1 Running 0 9m40s
secret-generator-7ffbcd8fcb-wn58c 1/1 Running 0 9m39s
spark-port-forwarder-bfvdm 1/1 Running 0 9m39s
tcp-ingress-controller-7d65df6d8f-qqxzn 0/1 CrashLoopBackOff 5 9m39s
web-5dddb59597-b7sp5 1/1 Running 1 9m39s
web-5dddb59597-m88qn 1/1 Running 1 9m39s
web-5dddb59597-pclf2 0/1 Error 6 9m39s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 12m default-scheduler Successfully assigned default/web-5dddb59597-pclf2 to xyz.copration
Warning FailedMount 11m (x3 over 12m) kubelet, xyz.copration MountVolume.SetUp failed for volume "ds-operator-crt" : secrets "ds-operator-tls" not found
Warning FailedMount 11m (x3 over 12m) kubelet, xyz.copration MountVolume.SetUp failed for volume "tcp-ingress-controller-crt" : secrets "tcp-ingress-controller-tls" not found
Warning FailedMount 11m (x4 over 12m) kubelet, xyz.copration MountVolume.SetUp failed for volume "s2i-server-crt" : secrets "s2i-server-tls" not found
Warning FailedMount 11m (x4 over 12m) kubelet, xyz.copration MountVolume.SetUp failed for volume "s2i-client-crt" : secrets "s2i-client-tls" not found
Warning FailedMount 11m (x4 over 12m) kubelet, xyz.copration MountVolume.SetUp failed for volume "ds-vfs-crt" : secrets "ds-vfs-tls" not found
Warning FailedMount 11m (x3 over 12m) kubelet, xyz.copration MountVolume.SetUp failed for volume "web-tls" : secrets "web-tls" not found
Warning FailedMount 11m (x4 over 12m) kubelet, xyz.copration MountVolume.SetUp failed for volume "ds-cdh-client-crt" : secrets "ds-cdh-client-tls" not found
Normal Pulled 7m19s (x5 over 9m24s) kubelet, xyz.copration Container image "docker-registry.infra.cloudera.com/cdsw/web:46715e4" already present on machine
Warning BackOff 2m23s (x33 over 9m2s) kubelet, xyz.copration Back-off restarting failed container
Created on 09-18-2019 09:48 AM - edited 09-18-2019 09:49 AM
@SrJay it looks like you are running CDSW 1.6 on a VMware host that has IPv6 disabled. Can you please confirm by searching the dmesg output for the kernel command line (cmdline) and for segfaults? If you see segmentation faults for the node process and the command line shows ipv6.disable=1, then you are likely hitting a known issue seen with the combination of Node.js 10.x, gRPC, and IPv6. The workaround is to enable IPv6 on all hosts running CDSW, following this Red Hat article: https://access.redhat.com/solutions/8709#rhel7enable
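If it helps, the two checks described above could be scripted roughly as follows. This is only a sketch and not an official Cloudera tool. The function names are made up for illustration, and the segfault pattern assumes the usual kernel log format `node[PID]: segfault at ...`. The kernel documents the parameter as `ipv6.disable=1`; the sketch also accepts the `ipv6.disabled=1` spelling in case it appears that way on a host.

```python
import re
import subprocess
from pathlib import Path
from typing import List

# Matches both "ipv6.disable=1" (kernel spelling) and "ipv6.disabled=1".
IPV6_OFF = re.compile(r"^ipv6\.disabled?=1$")
# Assumed kernel log format for a crashing process named "node".
NODE_SEGFAULT = re.compile(r"\bnode\[\d+\]: segfault\b")


def ipv6_disabled_on_cmdline(cmdline: str) -> bool:
    """Return True if any kernel command-line token disables IPv6."""
    return any(IPV6_OFF.match(token) for token in cmdline.split())


def node_segfaults(dmesg_text: str) -> List[str]:
    """Return the dmesg lines that report a segfault in a node process."""
    return [line for line in dmesg_text.splitlines() if NODE_SEGFAULT.search(line)]


def main() -> None:
    # Reading /proc/cmdline and running dmesg only works on a Linux host,
    # and dmesg may require root privileges.
    try:
        cmdline = Path("/proc/cmdline").read_text()
        dmesg = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        ).stdout
    except OSError as exc:
        print(f"Could not collect host information: {exc}")
        return
    print("IPv6 disabled on kernel command line:", ipv6_disabled_on_cmdline(cmdline))
    for line in node_segfaults(dmesg):
        print("node segfault:", line)


if __name__ == "__main__":
    main()
```

Run it on each CDSW host. If it reports both the disabled-IPv6 flag and node segfaults, the host matches the symptoms described above.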