
Unable to create CML Experiment - how to debug

New Contributor


I am unable to create a support case for this, as my support license does not appear to cover ML Experiments.

I am evaluating Cloudera CML on my company's CDP instance. I created several Experiments in a project to test the feature (a simple script generating some random numbers as metrics), and the experiments were created and ran fine. However, when I tried to create the same experiment later, the UI just timed out. Using the browser's developer tools to inspect the request, I saw that the "runs" request failed with a 504 timeout error. I could not troubleshoot the issue further in the UI, and I am not sure whether ECS would have any useful information. A few hours later, I could run the experiment successfully again. To my knowledge, nobody touched the CDP cluster in between.

While I could not create experiments, I looked at my Kubernetes namespace <CML workspace name - userXX> inside ECS and did not see a pod starting. No events were being created either.

We would like to better understand the product before productionizing it. Could anyone advise where I might look to debug the issue? Thank you.


The version we're using is CML 1.4 on CDP, in an intranet network.
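
For anyone repeating the namespace check described above, here is a minimal Python sketch of it, assuming `kubectl` is installed and configured against the ECS cluster. The helper names `kubectl_json` and `pods_not_running` and the sample pod and node names are hypothetical, made up for illustration. It lists pods that are neither Running nor Succeeded, together with the node they were scheduled on.

```python
import json
import subprocess


def kubectl_json(*args: str) -> dict:
    """Run kubectl with JSON output and return the parsed result.

    Assumes kubectl is on PATH and points at the right cluster.
    """
    result = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def pods_not_running(pod_list: dict) -> list:
    """Return (name, phase, node) for pods that are neither Running nor Succeeded."""
    stuck = []
    for pod in pod_list.get("items", []):
        phase = pod.get("status", {}).get("phase", "Unknown")
        if phase not in ("Running", "Succeeded"):
            # nodeName is absent until the scheduler has placed the pod
            node = pod.get("spec", {}).get("nodeName", "<unscheduled>")
            stuck.append((pod["metadata"]["name"], phase, node))
    return stuck


# Hand-written sample (hypothetical names), so no cluster is needed to try it.
sample_pods = {
    "items": [
        {"metadata": {"name": "exp-run-1"}, "spec": {"nodeName": "node-a"},
         "status": {"phase": "Running"}},
        {"metadata": {"name": "exp-run-2"}, "spec": {"nodeName": "node-b"},
         "status": {"phase": "Pending"}},
        {"metadata": {"name": "exp-run-3"}, "spec": {},
         "status": {"phase": "Pending"}},
    ]
}
print(pods_not_running(sample_pods))
```

Against a live cluster, the same check would be `pods_not_running(kubectl_json("get", "pods", "-n", namespace))`, with `namespace` set to the workspace namespace. An empty result while the UI is timing out would suggest the request never reached the point of creating a pod.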


New Contributor

Just an update: In ECS, I see that the pods are being assigned to a particular node and they're "stuck" with a message:

"failed to create pod sandbox: rpc error:code= unknown desc = failed to get sandbox image "" failed to pull image "index.docker.rancherpause:3.6": failed to pull and unpack image "".....
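
To see which pods are affected by this kind of error, the namespace events can be filtered by reason. The sketch below is illustrative only: it assumes the input is the parsed output of `kubectl get events -n <namespace> -o json`, and that the kubelet reports these failures with the event reason `FailedCreatePodSandBox`. The function name `sandbox_failures` and the sample events are hypothetical.

```python
def sandbox_failures(event_list: dict) -> list:
    """Return (pod_name, message) pairs for pod sandbox creation failures."""
    hits = []
    for event in event_list.get("items", []):
        if event.get("reason") == "FailedCreatePodSandBox":
            pod = event.get("involvedObject", {}).get("name", "<unknown>")
            hits.append((pod, event.get("message", "")))
    return hits


# Hand-written sample events (made up for illustration, not real cluster output).
sample_events = {
    "items": [
        {"reason": "Scheduled",
         "involvedObject": {"name": "exp-run-2"},
         "message": "Successfully assigned pod to node"},
        {"reason": "FailedCreatePodSandBox",
         "involvedObject": {"name": "exp-run-2"},
         "message": "failed to create pod sandbox (sample text)"},
    ]
}
for pod, message in sandbox_failures(sample_events):
    print(pod, "->", message)
```

If every hit points at the same node, that node's ability to pull the sandbox (pause) image from the registry is a reasonable place to look next, particularly on an intranet network.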


New Contributor

I've raised a support case instead and will not be following up on this thread.
