Created 02-06-2018 03:40 PM
I'm trying to start a SparkR session from within an R session, not through spark-submit. However, when I try to set a queue like this, it doesn't work:
sparkR.session(queue ="queue_name")
The only way I can get it to actually set that queue is to use the old ```init()``` function, which throws a warning:

```
sc <- SparkR::sparkR.init(master = "yarn-client",
                          sparkEnvir = list(spark.yarn.queue = "a2_hungry"))
hiveContext <- sparkRHive.init(sc)
```
Warning message: 'SparkR::sparkR.init' is deprecated. Use 'sparkR.session' instead. See help("Deprecated")
How can I set a queue in the non-deprecated way?
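Based on the docs, I'd guess it goes through the `sparkConfig` argument, something like this (untested guess; the queue name is just a placeholder):

```r
# Hypothetical: pass the YARN queue as a sparkConfig list element (untested)
sparkR.session(master = "yarn-client",
               sparkConfig = list(spark.yarn.queue = "queue_name"))
```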
Created 02-07-2018 08:19 AM
Are you using Spark 2.x? The API changed in Spark 2.
Please see below: https://spark.apache.org/docs/2.0.0/api/R/sparkR.session.html
```
sparkR.session()
df <- read.json(path)

sparkR.session("local[2]", "SparkR", "/home/spark")
sparkR.session("yarn-client", "SparkR", "/home/spark",
               list(spark.executor.memory = "4g"),
               c("one.jar", "two.jar", "three.jar"),
               c("com.databricks:spark-avro_2.10:2.0.1"))
sparkR.session(spark.master = "yarn-client", spark.executor.memory = "4g")
```
Created 02-07-2018 02:23 PM
@Sandeep Nemuri Okay, so how can I use that to set the queue? Would it be a list element in the sparkConfig argument?
Created 02-27-2018 04:52 PM
When you start SparkR, a Spark session is already created.
You need to stop the current session and start a new one to apply the desired settings.
I use the following:

```
sparkR.stop()
sparkR.session(
  # master = "local[2]",   # local master
  master = "yarn",         # cluster master
  appName = "my_sparkR",
  sparkConfig = list(
    spark.driver.memory = "4g",
    spark.executor.memory = "2g",
    spark.yarn.queue = "your_desired_queue"
  )
)
```
Verify from the Spark monitoring page that the settings updated correctly.
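If you're on Spark 2.2 or later, you can also check from within the R session itself; `sparkR.conf()` reads the current session's configuration (a quick sketch, assuming a session is already running):

```r
# Assumes Spark 2.2+, where sparkR.conf() is available
sparkR.conf("spark.yarn.queue")   # look up the queue the session is using
sparkR.conf()                     # or dump the full session config as a list
```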