Unable to create CML Experiment - how to debug

New Contributor

Hi,

I am unable to create a support case for this, as my support license does not appear to cover ML Experiments.

I am evaluating Cloudera CML on my company's CDP instance. I created several Experiments in a project to test the feature (a simple script that generates random numbers as metrics), and they were created and ran without problems. When I tried to create the same experiment later, however, the UI timed out. In the browser's developer tools I could see that the "runs" request failed with a 504 timeout error. I cannot troubleshoot the issue any further from the UI, and I am not sure whether ECS would have any useful information.

The same experiment ran successfully a few hours later. To my knowledge, nobody touched the CDP cluster in between. While I was unable to create experiments, I looked at my Kubernetes namespace <CML workspace name - userXX> in ECS and did not see a pod starting. No events were being created either.

We would like to understand the product better before putting it into production. Could anyone advise where I should look to debug this issue? Thank you.

The version we're using is CML 1.4 on CDP, in an intranet network.
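
For anyone retracing these steps, the checks described above (pods and events in the workspace namespace) could be scripted roughly as follows. This is only a sketch, assuming `kubectl` is installed and configured against the ECS cluster. The function names and the example namespace are hypothetical. The third command widens the search to Pending pods in every namespace, on the assumption that the stuck pod may not be in the namespace that was inspected.

```python
import shlex
import subprocess


def build_debug_commands(namespace):
    """Return kubectl argument lists for a first look at a stuck workload.

    `namespace` is the CML workspace/user namespace (hypothetical value below).
    """
    return [
        # Pods in the user's namespace, including the node they were assigned to.
        ["kubectl", "get", "pods", "-n", namespace, "-o", "wide"],
        # Events in that namespace, oldest first.
        ["kubectl", "get", "events", "-n", namespace,
         "--sort-by=.lastTimestamp"],
        # Pending pods anywhere in the cluster, in case the pod lives elsewhere.
        ["kubectl", "get", "pods", "--all-namespaces",
         "--field-selector=status.phase=Pending", "-o", "wide"],
    ]


def run_debug_commands(namespace):
    """Run each command and print its output; failures are shown, not raised."""
    for argv in build_debug_commands(namespace):
        print("$ " + shlex.join(argv))
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
        print(result.stdout or result.stderr)
```

Calling `run_debug_commands("my-workspace-user01")` with the real namespace name prints the output of each command in turn. A pod that shows up as Pending can then be inspected with `kubectl describe pod <pod name> -n <namespace>`.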

2 REPLIES

New Contributor

Just an update: In ECS, I see that the pods are being assigned to a particular node and they're "stuck" with a message:

"failed to create pod sandbox: rpc error: code = unknown desc = failed to get sandbox image "index.docker.io/rancher/pause:3.6": failed to pull image "index.docker.io/rancher/pause:3.6": failed to pull and unpack image "docker.io/rancher/pause:3.6".....
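
A message of this shape names the image the node was trying to pull and, by extension, the registry it was trying to reach. On an intranet cluster, one possible reading is that the node cannot reach that registry, though the truncated part of the message would be needed to confirm it. The sketch below pulls the image references and registry hosts out of such a message so they can be checked against what the nodes can actually reach. The sample text is modelled on the message above, and the function names are hypothetical.

```python
import re

# Matches the quoted reference after the word "image" in a kubelet/containerd message.
IMAGE_PATTERN = re.compile(r'image "([^"]+)"')


def images_in_message(message):
    """Return the distinct image references quoted in the message, in order."""
    seen = []
    for ref in IMAGE_PATTERN.findall(message):
        if ref not in seen:
            seen.append(ref)
    return seen


def registry_host(image_ref):
    """Return the registry host of an image reference.

    The first path component is treated as a host only if it looks like one
    (contains a dot or a colon, or is "localhost"); otherwise the reference
    is assumed to resolve to Docker Hub (docker.io).
    """
    first, _, rest = image_ref.partition("/")
    if rest and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


sample = (
    'failed to create pod sandbox: rpc error: code = unknown desc = '
    'failed to get sandbox image "index.docker.io/rancher/pause:3.6": '
    'failed to pull image "index.docker.io/rancher/pause:3.6": '
    'failed to pull and unpack image "docker.io/rancher/pause:3.6"'
)

for ref in images_in_message(sample):
    print(ref, "->", registry_host(ref))
```

If the hosts printed are public registries that the intranet cannot reach, the next thing to check would be whether the affected node already has that image available locally or through an internal registry.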

New Contributor

I've raised a support case instead and will not be following up on this thread.
