spark2 on RStudio

Explorer

Team,

I am getting the error below while initiating a connection to Hive through RStudio using Spark2.

Code:

Sys.setenv("SPARK_MAJOR_VERSION"="2")
ycc <- rxSparkConnect(consoleOutput=TRUE, executorMem="4g", driverMem="4g", executorOverheadMem="4g", executorCores=1, numExecutors=5, persistentRun=TRUE, extraSparkConfig='--conf spark.yarn.queue=default')

Error:

No Spark applications can be reused. It may take 1 to 2 minutes to launch a new Spark application.
Error in rxRemoteExecute(computeContext, shellCmd, schedulerJobInstance) :
  Error while running RevoSparkSubmitLauncher
ERROR: Failed to start a Spark application, perhaps because the cluster is busy with other jobs. Please check YARN usage and try again.
rxSetComputeContext(ycc)
Error in is(computeContext, "RxSpark") : object 'ycc' not found

I can see that enough resources are available on the cluster and for this particular queue.

4 REPLIES

@suresh krish

Try setting the "SPARK_HOME" environment variable so that SparkR is on the search path for R packages, and then initialize the Spark session.

Sys.setenv("SPARK_MAJOR_VERSION"="2")
Sys.setenv("SPARK_HOME"="/usr/hdp/2.6.1.0-129/spark2")
ycc <- rxSparkConnect(consoleOutput=TRUE, executorMem="10g", driverMem="10g", executorOverheadMem="10g", executorCores=1, numExecutors=5, extraSparkConfig="--conf spark.yarn.queue=default")
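
Before connecting, it may be worth confirming that the SPARK_HOME path actually exists on the node where R runs. The following is only a minimal sketch in base R: it assumes the HDP path shown above (which may differ on other clusters) and assumes that spark-submit lives under bin/ in that directory.

```r
# Hypothetical pre-flight check; adjust the path to match the local install.
spark_home <- "/usr/hdp/2.6.1.0-129/spark2"

# Stop early if the directory is missing on this node.
if (!dir.exists(spark_home)) {
  stop("SPARK_HOME directory not found: ", spark_home)
}

# Assumption: spark-submit is located under <SPARK_HOME>/bin.
if (!file.exists(file.path(spark_home, "bin", "spark-submit"))) {
  stop("spark-submit not found under: ", spark_home)
}

# Set both variables, then print them to confirm what the R session sees.
Sys.setenv(SPARK_MAJOR_VERSION = "2", SPARK_HOME = spark_home)
print(Sys.getenv(c("SPARK_MAJOR_VERSION", "SPARK_HOME")))
```

If either check stops with an error, the connection attempt would likely fail for a reason unrelated to YARN capacity.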

More info on link: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-r-scaler-sparkr

Hope this helps you.

Explorer

Hi Sridhar, thanks for your response. I tried pointing to the spark2 conf dir but got the same error.

@suresh krish

May I know what's the error message you are getting?

Explorer

Hi Sridhar, thanks for your reply. I tried that option and am getting the same error.