Error in Scala/Spark Project on Cloudera Data Science Workbench

MGarg — Tue, 21 Apr 2026 13:29:21 GMT

We have a Hadoop cluster with ACLs for YARN resource pools.

I am trying to create a Scala/Spark project within CDSW, but it throws the following error as soon as the engine starts:

ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1495197568507_9413 to YARN : Application rejected by queue placement policy

I know I need to tell it to use a specific YARN resource pool, but I don't know how or where to set that parameter so that it takes effect. I tried setting it as a parameter in the engine settings, but that didn't work.

Does anyone have any idea about this?

Thanks in advance!

Re: Error in Scala/Spark Project on Cloudera Data Science Workbench

MGarg — Wed, 24 May 2017 20:31:49 GMT

Okay, after much research I found a way to configure the YARN resource pool within the Spark/Scala project. Here are the steps:

1. Create a Scala project and start the engine.

2. Engine startup will fail the very first time.

3. Open "Terminal" in the Workbench window and do the following:

i. Verify that you are in the /home/cdsw directory.

ii. Create a file named "spark-defaults.conf" and add the line "spark.yarn.queue={QUEUE_NAME}", replacing {QUEUE_NAME} with the name of your YARN resource pool.

iii. Save and exit.

4. Stop and start the engine again, and the issue will be resolved.
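
For anyone who prefers to script step 3, here is a minimal sketch of the same edit as a shell function. The function name set_yarn_queue and the queue name root.example_pool are made up for illustration; only the file name, the /home/cdsw directory and the spark.yarn.queue property come from the steps above.

```shell
# set_yarn_queue QUEUE [DIR]
# Writes "spark.yarn.queue=QUEUE" into DIR/spark-defaults.conf, replacing any
# earlier spark.yarn.queue line. DIR defaults to /home/cdsw, the project
# directory named in the steps above. The function name is hypothetical.
set_yarn_queue() {
  queue="$1"
  dir="${2:-/home/cdsw}"
  conf="$dir/spark-defaults.conf"

  # Make sure the file exists so grep has something to read.
  touch "$conf"

  # Copy every line except an existing spark.yarn.queue setting.
  # grep exits non-zero when nothing is left to print, hence "|| true".
  grep -v '^spark\.yarn\.queue[[:space:]=]' "$conf" > "$conf.tmp" || true

  # Append the new queue setting and swap the file into place.
  printf 'spark.yarn.queue=%s\n' "$queue" >> "$conf.tmp"
  mv "$conf.tmp" "$conf"
}
```

In the session terminal this would be called as, for example, set_yarn_queue root.example_pool, followed by the engine restart in step 4. Other lines already in spark-defaults.conf are left untouched.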

Regards,

Re: Error in Scala/Spark Project on Cloudera Data Science Workbench

tristanzajonc — Thu, 25 May 2017 19:34:29 GMT

MG,

I'm glad you figured this out. You can configure the YARN queue, or any Spark option, either globally using Cloudera Manager or on a per-project basis within Cloudera Data Science Workbench. You have already found the per-project route, but the documentation for both options is here:

https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_spark_configuration.html#config_files

Configuring this option globally may make more sense, unless you're using a queue specifically for Spark jobs launched from Cloudera Data Science Workbench.

Best,

Tristan

Re: Error in Scala/Spark Project on Cloudera Data Science Workbench

MGarg — Thu, 25 May 2017 19:39:05 GMT

Hi Tristan,

You are right, configuring it globally was much easier, but we have tenant-specific queues and we want to keep tenants contained within their own pools, which is why we needed an engine/project-specific setting.

Anyway, thanks for your response.

Regards,