Created on 03-30-2022 01:59 PM - edited 03-30-2022 02:02 PM
What does it mean when a manually triggered job is in the "Scheduling" job status? What happens behind the scenes during this Scheduling phase? Recently I have noticed my job getting stuck in this phase for more than 10 minutes, which is concerning.
Created 03-30-2022 02:44 PM
Hi Madhu,
Thank you for contacting us! I understand that your CDSW engine spent 10 minutes in the scheduling phase and you would like to understand the phases a little better. Is this correct?
The happy path for these phases is usually scheduling -> starting -> running -> succeeded. Other phases can include stopped, timeout and failed. In general, the time spent in scheduling is time spent waiting for resources to become available for the engine's pod.
If this is a persistent issue with your CDSW engines, I recommend opening a support case so that we can take a closer look at the logs around the time of one of the occurrences.
Created 03-30-2022 04:41 PM
Thanks for your advice, I have raised a support case.
Created 03-30-2022 03:33 PM
When this is happening, are you able to start Sessions as well?
Do you have access to the Admin -> Usage page or kubectl access? You should look to see if there are enough resources available for the engine that you have chosen to use when running the job.
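If someone with kubectl access can help, a rough way to check for this is to look for pods that Kubernetes has not been able to place on a node. The sketch below is only an illustration: it assumes that an engine stuck in "Scheduling" shows up as a pod in the `Pending` phase with a `PodScheduled` condition whose reason is `Unschedulable`, and that the pod list was saved with `kubectl get pods --all-namespaces -o json`. The function name `find_unschedulable_pods` and the sample pod data (namespace, pod name, message) are made up for the example.

```python
import json


def find_unschedulable_pods(pod_list):
    """Return (namespace, name, message) for Pending pods the scheduler could not place.

    `pod_list` is the parsed JSON from `kubectl get pods -o json`.
    """
    results = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") != "Pending":
            continue
        for cond in status.get("conditions", []):
            # A pod waiting on resources reports PodScheduled=False, reason Unschedulable.
            if (
                cond.get("type") == "PodScheduled"
                and cond.get("status") == "False"
                and cond.get("reason") == "Unschedulable"
            ):
                meta = pod.get("metadata", {})
                results.append(
                    (meta.get("namespace"), meta.get("name"), cond.get("message", ""))
                )
    return results


# Hypothetical sample data, shaped like kubectl's JSON output (not from a real cluster).
SAMPLE_POD_LIST = {
    "items": [
        {
            "metadata": {"namespace": "default-user-1", "name": "engine-abc123"},
            "status": {
                "phase": "Pending",
                "conditions": [
                    {
                        "type": "PodScheduled",
                        "status": "False",
                        "reason": "Unschedulable",
                        "message": "0/3 nodes are available: 3 Insufficient memory.",
                    }
                ],
            },
        },
        {
            "metadata": {"namespace": "default-user-2", "name": "engine-def456"},
            "status": {"phase": "Running", "conditions": []},
        },
    ]
}

if __name__ == "__main__":
    for namespace, name, message in find_unschedulable_pods(SAMPLE_POD_LIST):
        print(f"{namespace}/{name}: {message}")
```

To run it against a real cluster, replace `SAMPLE_POD_LIST` with `json.load(open("pods.json"))`, where `pods.json` holds the saved kubectl output. The message on each pod usually says which resource (CPU, memory, GPU) is short.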
Created 03-30-2022 04:43 PM
Unfortunately I do not have admin access. I have raised a support case to understand the details better. Thanks for your response.