Support Questions

backtohome · ‎01-31-2026

Hi everyone.

I just want to check: as of the latest CML version, is there any support for using GPUs for Spark in CML?

I have seen:

1. GPU used for Spark in CDE, or

2. GPU used in a general Python project (using torch, tensorflow, RAPIDS cudf/cuml) in CML,

but can't find any for Spark in CML.

If there is no such support, is there a specific reason why it cannot be done in CML when it can be done for general Spark on k8s?

Thanks for any info...

vafs · ‎02-02-2026

Hello @backtohome,

As far as I know, we do support GPUs for Spark workloads on CML.
The documentation talks about that:

Autoscaling: Cloudera AI also supports native cloud autoscaling via Kubernetes. When clusters do not have the required capacity to run workloads, they can automatically scale up additional nodes. Administrators can configure auto-scaling upper limits, which determine how large a compute cluster can grow. Since compute costs increase as cluster size increases, having a way to configure upper limits gives administrators a method to stay within a budget. Autoscaling policies can also account for heterogeneous node types such as GPU nodes.
https://docs.cloudera.com/machine-learning/1.5.5/spark/topics/ml-apache-spark-overview.html

You have to configure them by following this doc:
https://docs.cloudera.com/machine-learning/1.5.5/gpu/topics/ml-gpu.html

If you do not have GPUs configured on CML, the UI will not show you the GPU options.

Regards,
Andrés Fallas

backtohome · ‎02-02-2026

Thanks for the reply @vafs. I can't be sure that is the definitive answer though, because it combines answers from two different sources:
- the first one is about supporting Spark (nothing about GPU is mentioned).
- the second one is about GPU (nothing about Spark is mentioned).

And when I tried Spark code that works on YARN+GPU, slightly modified to fit CML, it didn't go well. I'm not sure whether I did something wrong, which is why I am looking for a definitive answer, ideally with a GitHub example like the ones Cloudera has provided for PyTorch and TensorFlow on CML. Hence this question.

I also vaguely remember reading somewhere that GPU support for Spark is only a Technical Preview in CDE and not yet available in CML, but I can't find that page again :).
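For context, the GPU-related settings that a Spark job on YARN or Kubernetes typically carries can be sketched as below. This is only an illustration of what had to be adapted, not a confirmed CML recipe: the configuration keys are the standard Spark 3.x resource-scheduling keys plus the NVIDIA RAPIDS Accelerator plugin keys, while the helper names `gpu_spark_conf` and `apply_conf` and the discovery script path are hypothetical. Whether CML honours any of these settings is exactly the open question in this thread.

```python
# Sketch only: collects the GPU-related Spark settings in one place.
# Assumption: Spark 3.x with the RAPIDS Accelerator jar already on the classpath.
# Nothing here is confirmed to work inside a CML session.

def gpu_spark_conf(gpus_per_executor=1, tasks_per_gpu=4,
                   discovery_script="/path/to/getGpusResources.sh",  # hypothetical path
                   on_kubernetes=True):
    """Return a dict of Spark config keys for GPU scheduling."""
    if gpus_per_executor < 1 or tasks_per_gpu < 1:
        raise ValueError("gpus_per_executor and tasks_per_gpu must be >= 1")
    conf = {
        # How many GPUs each executor requests.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # Fraction of a GPU each task needs, so several tasks can share one GPU.
        "spark.task.resource.gpu.amount": str(1.0 / tasks_per_gpu),
        # Script Spark runs on the executor to discover GPU addresses.
        "spark.executor.resource.gpu.discoveryScript": discovery_script,
        # RAPIDS Accelerator plugin for SQL/DataFrame operations.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": "true",
    }
    if on_kubernetes:
        # On Kubernetes, Spark also needs the device plugin vendor name.
        conf["spark.executor.resource.gpu.vendor"] = "nvidia.com"
    return conf


def apply_conf(builder, conf):
    """Apply every key/value pair to a SparkSession-style builder."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder
```

With PySpark available, `apply_conf(SparkSession.builder.appName("gpu-test"), gpu_spark_conf()).getOrCreate()` would start a session with these settings.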

backtohome · ‎02-02-2026

It seems the answer was there all along. When choosing a runtime, e.g. JupyterLab -> Python xx -> Edition: Nvidia GPU, and enabling Spark, this message appears:

"Spark is not compatible with the selected Edition. If you enable Spark for the session, it can be used independently but it will not be accelerated"

I didn't see the warning before because we only allow our own customized runtime, which doesn't display this message.

GPU in CML Spark