Member since: 10-18-2023
Posts: 30
Kudos Received: 19
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
842 | 04-11-2024 09:23 PM | |
416 | 02-19-2024 04:12 PM |
04-13-2024
05:22 PM
1 Kudo
@upadhyayk04
logs -f -n longhorn-system longhorn-csi-plugin-cxglq
Defaulted container "node-driver-registrar" out of: node-driver-registrar, longhorn-liveness-probe, longhorn-csi-plugin
I0413 09:50:20.091344 290593 main.go:166] Version: v2.5.0
I0413 09:50:20.091369 290593 main.go:167] Running node-driver-registrar in mode=registration
I0413 09:50:20.092527 290593 main.go:191] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0413 09:50:21.093286 290593 main.go:198] Calling CSI driver to discover driver name
I0413 09:50:21.094471 290593 main.go:208] CSI driver name: "driver.longhorn.io"
I0413 09:50:21.094497 290593 node_register.go:53] Starting Registration Server at: /registration/driver.longhorn.io-reg.sock
I0413 09:50:21.094656 290593 node_register.go:62] Registration Server started at: /registration/driver.longhorn.io-reg.sock
I0413 09:50:21.094779 290593 node_register.go:92] Skipping HTTP server because endpoint is set to: ""
I0413 09:50:21.466617 290593 main.go:102] Received GetInfo call: &InfoRequest{}
I0413 09:50:21.466820 290593 main.go:109] "Kubelet registration probe created" path="/var/lib/kubelet/plugins/driver.longhorn.io/registration"
I0413 09:50:23.205994 290593 main.go:120] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
04-12-2024
06:30 PM
@upadhyayk04 The vault-0 pod keeps cycling between Terminating and ContainerCreating because the volume attach fails:
Warning FailedAttachVolume 108s attachdetach-controller AttachVolume.Attach failed for volume "pvc-33f9624d-4d90-48fa-8469-02a104df1d10" : rpc error: code = DeadlineExceeded desc = volume pvc-33f9624d-4d90-48fa-8469-02a104df1d10 failed to attach to node cdppvc2.hadoop.com with
04-12-2024
12:30 AM
1 Kudo
@upadhyayk04 Look, all pods are fine.

[root@cdppvc1 ~]# k get ns
NAME STATUS AGE
default Active 4h56m
ecs-webhooks Active 4h55m
kube-node-lease Active 4h56m
kube-public Active 4h56m
kube-system Active 4h56m
local-path-storage Active 4h55m
longhorn-system Active 4h55m
vault-system Active 116s

[root@cdppvc1 ~]# k get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
ecs-webhooks ecs-tolerations-webhook-77d857599d-b8hsh 1/1 Running 0 39m
ecs-webhooks ecs-tolerations-webhook-77d857599d-h6qxk 1/1 Running 0 39m
kube-system etcd-cdppvc1.hadoop.com 1/1 Running 1 4h54m
kube-system helm-install-rke2-ingress-nginx-mk845 0/1 Completed 0 10m
kube-system kube-apiserver-cdppvc1.hadoop.com 1/1 Running 1 4h54m
kube-system kube-controller-manager-cdppvc1.hadoop.com 1/1 Running 3 (89m ago) 4h54m
kube-system kube-proxy-cdppvc1.hadoop.com 1/1 Running 0 86m
kube-system kube-proxy-cdppvc2.hadoop.com 1/1 Running 0 4h53m
kube-system kube-scheduler-cdppvc1.hadoop.com 1/1 Running 1 (90m ago) 4h54m
kube-system rke2-canal-9h5hh 2/2 Running 0 4h53m
kube-system rke2-canal-qk2wg 2/2 Running 2 (90m ago) 4h53m
kube-system rke2-coredns-rke2-coredns-565dfc7d75-djp4t 1/1 Running 0 38m
kube-system rke2-coredns-rke2-coredns-565dfc7d75-gvxcj 1/1 Running 0 153m
kube-system rke2-coredns-rke2-coredns-autoscaler-6c48c95bf9-7ln92 1/1 Running 0 39m
kube-system rke2-ingress-nginx-controller-869fc5f494-xcz6x 1/1 Running 0 39m
kube-system rke2-metrics-server-c9c78bd66-blrwg 1/1 Running 0 156m
kube-system rke2-snapshot-controller-6f7bbb497d-wk5mg 1/1 Running 0 39m
kube-system rke2-snapshot-validation-webhook-65b5675d5c-7fst2 1/1 Running 0 39m
local-path-storage local-path-provisioner-6b8fcdf4f9-fqqnw 1/1 Running 0 155m
longhorn-system csi-attacher-5f79c59664-gsfc4 1/1 Running 0 156m
longhorn-system csi-attacher-5f79c59664-rppmd 1/1 Running 0 156m
longhorn-system csi-attacher-5f79c59664-spmmt 1/1 Running 1 (93m ago) 156m
longhorn-system csi-provisioner-7f9fff657d-mvmb6 1/1 Running 0 156m
longhorn-system csi-provisioner-7f9fff657d-r76kv 1/1 Running 1 (93m ago) 156m
longhorn-system csi-provisioner-7f9fff657d-wm77w 1/1 Running 0 156m
longhorn-system csi-resizer-7667995d7-fgkbd 1/1 Running 0 156m
longhorn-system csi-resizer-7667995d7-rn5ts 1/1 Running 1 (93m ago) 156m
longhorn-system csi-resizer-7667995d7-zx94l 1/1 Running 0 156m
longhorn-system csi-snapshotter-56954ddc99-b44ds 1/1 Running 0 156m
longhorn-system csi-snapshotter-56954ddc99-fmw8x 1/1 Running 1 (93m ago) 156m
longhorn-system csi-snapshotter-56954ddc99-jkwhv 1/1 Running 0 156m
longhorn-system engine-image-ei-6b4330bf-nnwmm 1/1 Running 0 4h52m
longhorn-system engine-image-ei-6b4330bf-npf9k 1/1 Running 1 (90m ago) 4h52m
longhorn-system instance-manager-12ec73857d1e3aea875a32230969da75 1/1 Running 0 38m
longhorn-system instance-manager-ad30a9ee514d3e836de7c5077cfe5ca6 1/1 Running 0 153m
longhorn-system longhorn-csi-plugin-j5xw4 3/3 Running 0 4h51m
longhorn-system longhorn-csi-plugin-v7bdh 3/3 Running 6 (86m ago) 4h51m
longhorn-system longhorn-driver-deployer-75c7cb9999-v8xgb 1/1 Running 0 156m
longhorn-system longhorn-manager-d495r 1/1 Running 1 (90m ago) 4h52m
longhorn-system longhorn-manager-nvgk7 1/1 Running 0 4h52m
longhorn-system longhorn-ui-64c4bfff54-d6c7n 1/1 Running 0 156m
longhorn-system longhorn-ui-64c4bfff54-vrx4q 1/1 Running 0 156m
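A listing where every pod shows Running can still hide a container that restarted several times. The following is a minimal sketch, not part of the original post, that scans pasted `k get pod -A` text for pods that are not ready or have a high restart count. The names `parse_pod_listing` and `needs_attention` and the restart threshold of 3 are assumptions chosen for illustration.

```python
from typing import List, NamedTuple


class PodRow(NamedTuple):
    namespace: str
    name: str
    ready: str
    status: str
    restarts: int


def parse_pod_listing(text: str) -> List[PodRow]:
    """Parse whitespace-separated `kubectl get pod -A` output into rows."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header and anything too short to be a pod row.
        if len(fields) < 6 or fields[0] == "NAMESPACE":
            continue
        # The first five columns never contain spaces; "(93m ago)" comes later.
        namespace, name, ready, status, restarts = fields[:5]
        rows.append(PodRow(namespace, name, ready, status, int(restarts)))
    return rows


def needs_attention(row: PodRow, restart_threshold: int = 3) -> bool:
    """Flag pods that are not ready, not Running/Completed, or restart often.

    restart_threshold is an arbitrary value picked for this sketch.
    """
    ready_now, ready_want = row.ready.split("/")
    not_ready = row.status != "Completed" and ready_now != ready_want
    bad_status = row.status not in ("Running", "Completed")
    return not_ready or bad_status or row.restarts >= restart_threshold


# Three rows copied from the listing in the post.
SAMPLE = """
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system helm-install-rke2-ingress-nginx-mk845 0/1 Completed 0 10m
longhorn-system csi-attacher-5f79c59664-gsfc4 1/1 Running 0 156m
longhorn-system longhorn-csi-plugin-v7bdh 3/3 Running 6 (86m ago) 4h51m
"""

flagged = [row.name for row in parse_pod_listing(SAMPLE) if needs_attention(row)]
print(flagged)
```

With the threshold used here, only longhorn-csi-plugin-v7bdh (6 restarts) is flagged, which is the pod whose events and logs are worth reading next.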
04-11-2024
11:29 PM
1 Kudo
@upadhyayk04 Pod list:

[root@cdppvc1 ~]# k get pod -n longhorn-system
NAME READY STATUS RESTARTS AGE
csi-attacher-5f79c59664-gsfc4 1/1 Running 0 96m
csi-attacher-5f79c59664-rppmd 1/1 Running 0 96m
csi-attacher-5f79c59664-spmmt 1/1 Running 1 (34m ago) 96m
csi-provisioner-7f9fff657d-mvmb6 1/1 Running 0 96m
csi-provisioner-7f9fff657d-r76kv 1/1 Running 1 (34m ago) 96m
csi-provisioner-7f9fff657d-wm77w 1/1 Running 0 96m
csi-resizer-7667995d7-fgkbd 1/1 Running 0 97m
csi-resizer-7667995d7-rn5ts 1/1 Running 1 (34m ago) 97m
csi-resizer-7667995d7-zx94l 1/1 Running 0 97m
csi-snapshotter-56954ddc99-b44ds 1/1 Running 0 97m
csi-snapshotter-56954ddc99-fmw8x 1/1 Running 1 (34m ago) 97m
csi-snapshotter-56954ddc99-jkwhv 1/1 Running 0 97m
engine-image-ei-6b4330bf-nnwmm 1/1 Running 0 3h52m
engine-image-ei-6b4330bf-npf9k 1/1 Running 1 (30m ago) 3h52m
instance-manager-12ec73857d1e3aea875a32230969da75 1/1 Running 0 34m
instance-manager-ad30a9ee514d3e836de7c5077cfe5ca6 1/1 Running 0 94m
longhorn-csi-plugin-j5xw4 3/3 Running 0 3h52m
longhorn-csi-plugin-v7bdh 3/3 Running 6 (26m ago) 3h52m
longhorn-driver-deployer-75c7cb9999-v8xgb 1/1 Running 0 96m
longhorn-manager-d495r 1/1 Running 1 (30m ago) 3h52m
longhorn-manager-nvgk7 1/1 Running 0 3h52m
longhorn-ui-64c4bfff54-d6c7n 1/1 Running 0 97m
longhorn-ui-64c4bfff54-vrx4q 1/1 Running 0 97m

describe pod csi-plugin

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SandboxChanged 28m (x5 over 30m) kubelet Pod sandbox changed, it will be killed and re-created.
Normal Started 28m kubelet Started container longhorn-liveness-probe
Normal Created 28m kubelet Created container node-driver-registrar
Normal Started 28m kubelet Started container node-driver-registrar
Normal Pulled 28m kubelet Container image "registry.ecs.internal/cloudera_thirdparty/longhornio/livenessprobe:v2.12.0" already present on machine
Normal Created 28m kubelet Created container longhorn-liveness-probe
Normal Pulled 28m kubelet Container image "registry.ecs.internal/cloudera_thirdparty/longhornio/csi-node-driver-registrar:v2.9.2" already present on machine
Warning BackOff 28m (x2 over 28m) kubelet Back-off restarting failed container longhorn-csi-plugin in pod longhorn-csi-plugin-v7bdh_longhorn-system(4fe460af-df96-4006-a631-dcc21bd46a07)
Normal Pulled 28m (x2 over 28m) kubelet Container image "registry.ecs.internal/cloudera_thirdparty/longhornio/longhorn-manager:v1.5.4" already present on machine
Normal Created 28m (x2 over 28m) kubelet Created container longhorn-csi-plugin
Normal Started 28m (x2 over 28m) kubelet Started container longhorn-csi-plugin
Warning Unhealthy 27m (x3 over 28m) kubelet Liveness probe failed: Get "http://10.42.0.6:9808/healthz": dial tcp 10.42.0.6:9808: connect: connection refused
Normal Killing 27m kubelet Container longhorn-csi-plugin failed liveness probe, will be restarted
Warning BackOff 27m (x2 over 28m) kubelet Back-off restarting failed container node-driver-registrar in pod longhorn-csi-plugin-v7bdh_longhorn-system(4fe460af-df96-4006-a631-dcc21bd46a07)

Log of the csi-plugin pod:

[root@cdppvc1 ~]# k logs -f longhorn-csi-plugin-v7bdh -n longhorn-system
Defaulted container "node-driver-registrar" out of: node-driver-registrar, longhorn-liveness-probe, longhorn-csi-plugin
I0412 06:02:45.498503 12176 main.go:135] Version: v2.9.2
I0412 06:02:45.498547 12176 main.go:136] Running node-driver-registrar in mode=
I0412 06:02:45.498553 12176 main.go:157] Attempting to open a gRPC connection with: "/csi/csi.sock"
W0412 06:02:55.498699 12176 connection.go:232] Still connecting to unix:///csi/csi.sock
I0412 06:03:00.414873 12176 main.go:164] Calling CSI driver to discover driver name
I0412 06:03:00.417352 12176 main.go:173] CSI driver name: "driver.longhorn.io"
I0412 06:03:00.417373 12176 node_register.go:55] Starting Registration Server at: /registration/driver.longhorn.io-reg.sock
I0412 06:03:00.417530 12176 node_register.go:64] Registration Server started at: /registration/driver.longhorn.io-reg.sock
I0412 06:03:00.417667 12176 node_register.go:88] Skipping HTTP server because endpoint is set to: ""
I0412 06:03:01.396603 12176 main.go:90] Received GetInfo call: &InfoRequest{}
I0412 06:03:01.402598 12176 main.go:101] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
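Note that the log above comes from the node-driver-registrar container only, because kubectl defaulted to the first container in the pod, while the events point at the longhorn-csi-plugin container failing its liveness probe. As a rough sketch (the helper names `containers_from_default_notice` and `logs_commands` are made up for this example), the "Defaulted container" notice can be turned into one `kubectl logs ... -c <container>` command per container:

```python
import re
from typing import List


def containers_from_default_notice(notice: str) -> List[str]:
    """Extract container names from kubectl's 'Defaulted container' notice."""
    match = re.search(r"out of:\s*(.+)$", notice.strip())
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def logs_commands(pod: str, namespace: str, containers: List[str]) -> List[List[str]]:
    """Build one `kubectl logs` argument list per container, using -c."""
    return [
        ["kubectl", "logs", pod, "-n", namespace, "-c", container]
        for container in containers
    ]


# Notice line copied from the post.
NOTICE = (
    'Defaulted container "node-driver-registrar" out of: '
    "node-driver-registrar, longhorn-liveness-probe, longhorn-csi-plugin"
)

containers = containers_from_default_notice(NOTICE)
commands = logs_commands("longhorn-csi-plugin-v7bdh", "longhorn-system", containers)
for command in commands:
    print(" ".join(command))
```

Since the container was restarted, adding `--previous` to the command for longhorn-csi-plugin would show the output of the instance that was killed rather than the current one.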
04-11-2024
11:23 PM
1 Kudo
Thank you for your answer, @upadhyayk04. I can't access the Longhorn UI; the [Storage UI] link in ECS does not work. I think Longhorn is not in a good state right now. What should I check?
04-11-2024
09:23 PM
I found the cause. When entering the LDAP information, I had to use port 3268. For example, ldap://ad.server.host:3268 instead of ldap://ad.server.host:389.
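For reference, 3268 is the port Active Directory conventionally uses for its Global Catalog over LDAP (3269 over LDAPS); whether that is the reason it worked here is an assumption, since the post only reports the port change. A small sketch of rewriting a configured URL accordingly, where the function name `to_global_catalog_url` is invented for illustration:

```python
from urllib.parse import urlsplit, urlunsplit

# Conventional Active Directory Global Catalog ports.
# 3268 comes from the post; 3269 (LDAPS) is the standard TLS counterpart.
GLOBAL_CATALOG_PORTS = {"ldap": 3268, "ldaps": 3269}


def to_global_catalog_url(url: str) -> str:
    """Return the same LDAP URL with the port swapped to the Global Catalog port."""
    parts = urlsplit(url)
    port = GLOBAL_CATALOG_PORTS.get(parts.scheme)
    if port is None or parts.hostname is None:
        raise ValueError(f"not an ldap:// or ldaps:// URL with a host: {url!r}")
    netloc = f"{parts.hostname}:{port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(to_global_catalog_url("ldap://ad.server.host:389"))
```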
04-11-2024
09:19 PM
1 Kudo
The "Initialize embedded Vault" step failed while installing Data Services. Once it has failed, it fails every time, even if I restart the installation from the beginning.

I tried the installation on the following system:
- Red Hat Enterprise Linux release 8.4 (Ootpa)
- Cloudera Manager 7.11.3 (#50275000 built by jenkins on 20240213-1404 git: 14e82e253ab970bfd576e4f80d297769a527df18)
- 1.5.2-b886-ecs-1.5.2-b886.p0.46792599 / 1.5.3-b297-ecs-1.5.3-b297.p0.50802651 (I tried both)

stdout

Fri Apr 12 11:36:52 KST 2024
Running on: cdppvc1.hostname.com (192.168.10.10)
JAVA_HOME=/usr/lib/jvm/java-openjdk
using /usr/lib/jvm/java-openjdk as JAVA_HOME
namespace/vault-system created
helmchart.helm.cattle.io/vault created
certificatesigningrequest.certificates.k8s.io/vault-csr created
certificatesigningrequest.certificates.k8s.io/vault-csr approved
secret/vault-server-tls created
secret/ingress-cert created
helmchart.helm.cattle.io/vault unchanged
Wait 30 seconds for startup ...
Timed out waiting for vault to come up

stderr

++ kubectl exec vault-0 -n vault-system -- vault operator init -tls-skip-verify -key-shares=1 -key-threshold=1 -format=json
error: unable to upgrade connection: container not found ("vault")
++ '[' 600 -gt 600 ']'
++ echo ...
++ sleep 10
++ time_elapsed=610
++ kubectl exec vault-0 -n vault-system -- vault operator init -tls-skip-verify -key-shares=1 -key-threshold=1 -format=json
error: unable to upgrade connection: container not found ("vault")
++ '[' 610 -gt 600 ']'
++ echo 'Timed out waiting for vault to come up'
++ exit 1

describe pod

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 108s default-scheduler Successfully assigned vault-system/vault-0 to cdppvc2.hostname.com
Warning FailedAttachVolume 108s attachdetach-controller AttachVolume.Attach failed for volume "pvc-33f9624d-4d90-48fa-8469-02a104df1d10" : rpc error: code = DeadlineExceeded desc = volume pvc-33f9624d-4d90-48fa-8469-02a104df1d10 failed to attach to node cdppvc2.hadoop.com with attachmentID csi-b57965889e8c6c2de7ffd7d045d52175b3415fa69c5e09d1cadc9c7ac1e5a467
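The stderr trace suggests the installer retries `vault operator init` every 10 seconds and gives up once the elapsed time exceeds 600 seconds, so the timeout is a symptom of the volume never attaching rather than a fault of its own. The sketch below models that loop in Python to make the order of operations explicit. It is reconstructed from the trace only, the real script is not shown in the post, and `wait_until` is a made-up name.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout_s: int = 600,
    interval_s: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() until it succeeds or elapsed time exceeds timeout_s.

    Mirrors the order seen in the trace: try, compare elapsed against the
    limit, then sleep and add the interval to the elapsed counter.
    """
    elapsed = 0
    while True:
        if check():
            return True
        if elapsed > timeout_s:
            return False
        sleep(interval_s)
        elapsed += interval_s


# Simulate a pod whose container never starts, without actually sleeping.
recorded_sleeps = []
never_ready = wait_until(lambda: False, sleep=recorded_sleeps.append)

# Simulate a pod that becomes ready on the third attempt.
attempts = iter([False, False, True])
eventually_ready = wait_until(lambda: next(attempts), sleep=lambda _: None)

print(never_ready, len(recorded_sleeps), eventually_ready)
```

Because the comparison is a strict "greater than", the loop makes one more attempt at 610 seconds before giving up, which matches the last two `kubectl exec` calls in the trace.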
03-18-2024
06:02 PM
3 Kudos
Version of CM is 7.11.3 and runtime is 7.1.9-1.cdh7.1.9.p3.48381316. The ECS version is 1.5.2-b886-ecs-1.5.2-b886.p0.46792599. CML version is 2.0.42-b80. @Surya_Sarikonda
03-14-2024
10:04 PM
1 Kudo
The upgrade is stuck in the 'upgrade control plane' step while upgrading ECS from 1.5.2 to 1.5.3.

Cause: cdp-release-post-upgrade-hook-job1 failed. The container logs are:

=========================================
**************************
**************************
https://console-cdp.apps.cdppvc1.hadoop.com
/etc/cdp/cdp_keys.json
While list environments: Something is wrong with output, Output JSON: ____ERROR__WHILE__CALLING__LIST__ENVIRONMENTS__COMMAND____
/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py:1103: InsecureRequestWarning: Unverified HTTPS request is being made to host 'console-cdp.apps.cdppvc1.hadoop.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
An error occurred: 5 NOT_FOUND: No access key found with name: 1b098f81-ce2c-4342-ba02-f0e607b0d83c (Status Code: 401; Error Code: ; Service: environments; Operation: listEnvironments; Request ID: 80bb6513-3fe8-41b9-bf2f-e0a6834cabdf;)
While list environments: Something is wrong with output, Output JSON: ____ERROR__WHILE__CALLING__LIST__ENVIRONMENTS__COMMAND____
/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py:1103: InsecureRequestWarning: Unverified HTTPS request is being made to host 'console-cdp.apps.cdppvc1.hadoop.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
An error occurred: 5 NOT_FOUND: No access key found with name: 1b098f81-ce2c-4342-ba02-f0e607b0d83c (Status Code: 401; Error Code: ; Service: environments; Operation: listEnvironments; Request ID: 5e7533e7-9167-40bf-870e-0880f2321829;)
While list environments: Something is wrong with output, Output JSON: ____ERROR__WHILE__CALLING__LIST__ENVIRONMENTS__COMMAND____
/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py:1103: InsecureRequestWarning: Unverified HTTPS request is being made to host 'console-cdp.apps.cdppvc1.hadoop.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
An error occurred: 5 NOT_FOUND: No access key found with name: 1b098f81-ce2c-4342-ba02-f0e607b0d83c (Status Code: 401; Error Code: ; Service: environments; Operation: listEnvironments; Request ID: 029d4064-3f33-4340-bcb3-8b9d7d80f2c7;)
Failed list environments due to JSON error, Tries exhausted
Traceback (most recent call last):
  File "/scripts/post_upgrade_hook.py", line 36, in <module>
    data = json.loads(stdout)
  File "/usr/lib64/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scripts/post_upgrade_hook.py", line 42, in <module>
    raise ValueError('While list environments:\n' + str(e))
ValueError: While list environments:
Expecting value: line 1 column 1 (char 0)
===================================

Which access key does the log refer to? What should I do to solve this problem?
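The JSONDecodeError at the end of the log is a secondary failure: the list-environments call was rejected ("No access key found"), so the text handed to `json.loads` was an error marker rather than JSON. The sketch below reproduces that behaviour and shows one way to surface the raw output instead of a bare decode error. It is an illustration only; `parse_cli_json` and the `environments` key in the sample are hypothetical and not taken from post_upgrade_hook.py.

```python
import json
from typing import Any, Optional, Tuple


def parse_cli_json(stdout: str) -> Tuple[Optional[Any], Optional[str]]:
    """Return (data, None) for valid JSON, else (None, a readable error)."""
    try:
        return json.loads(stdout), None
    except json.JSONDecodeError as exc:
        # Keep a short preview so the real failure stays visible in the log.
        preview = stdout.strip()[:80] or "<empty output>"
        return None, f"not JSON ({exc}): {preview}"


# Marker string copied from the job log in the post.
MARKER = "____ERROR__WHILE__CALLING__LIST__ENVIRONMENTS__COMMAND____"

bad_data, bad_error = parse_cli_json(MARKER)
print(bad_error)

# A well-formed payload (the shape is hypothetical) parses normally.
good_data, good_error = parse_cli_json('{"environments": []}')
print(good_data, good_error)
```

The message to act on is therefore the 401 for access key 1b098f81-ce2c-4342-ba02-f0e607b0d83c, not the JSON traceback.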
03-14-2024
04:10 PM
Version of CM is 7.11.3 and runtime is 7.1.9-1.cdh7.1.9.p3.48381316. The ECS version is 1.5.2-b886-ecs-1.5.2-b886.p0.46792599. CML version is 2.0.42-b80.