We noticed that CDSW does not seem to support starting multiple Spark sessions from a single CDSW session. This can be reproduced in several ways on CDSW 1.7.2 with base image version 10:
Start a CDSW session with the "Workbench - Scala" editor. This automatically starts a Spark session in the background via the Apache Toree kernel. Open the terminal built into CDSW and try to use spark-submit or spark-shell (with the default master setting, yarn).
Start a CDSW session with the "Jupyter" editor. Open a notebook and start a Spark session in it (it can be of any type: Python, Scala/Toree, IRkernel, ...). Open another notebook and try to start a Spark session there as well.
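To illustrate the Jupyter case, the code in both notebooks is essentially the following (a minimal PySpark sketch; the app name is ours, and we assume each notebook runs in its own kernel process, so `getOrCreate` in the second notebook cannot reuse the first notebook's session and instead submits a second YARN application):

```python
# Run in each notebook (each notebook has its own kernel process).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("yarn")            # the default master in our setup
    .appName("cdsw-notebook")  # hypothetical app name for illustration
    .getOrCreate()
)

# In the first notebook this succeeds. In the second notebook the
# corresponding YARN application fails to start with the error below.
```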
The error logged in YARN for the applications corresponding to the sessions that are started in parallel is:
This error prevents the YARN application from starting, and therefore the Spark session cannot be created.
Is this a limitation by design, or can it be avoided by proper custom configurations?
Side note: we also tested starting multiple Spark sessions with the same user but from separate CDSW sessions, and that works without a problem. So the limitation is not that our cluster disallows running multiple sessions with the same user.