Accessing Spark from RStudio Desktop using sparklyr and CDH 5.13.3

We are trying to connect to Spark from RStudio Desktop with sparklyr's spark_connect, and we cannot figure out how to set the connection parameters. The documentation describes them as follows:

master

Spark cluster URL to connect to. Use "local" to connect to a local instance of Spark installed via spark_install.

spark_home

The path to a Spark installation. Defaults to the path provided by the SPARK_HOME environment variable. If SPARK_HOME is defined, it will always be used unless the version parameter is specified to force the use of a locally installed version.

method

The method used to connect to Spark. The default connection method is "shell", which connects using spark-submit; use "livy" to perform remote connections over HTTP, or "databricks" when using a Databricks cluster.

config

Custom configuration for the generated Spark connection.
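For reference, the four parameters above map onto a call roughly like the one below. This is only a sketch: it assumes the sparklyr package is installed and that a local Spark has already been set up with spark_install, and it uses master = "local" purely because that is the value the description above mentions. The value needed for a CDH cluster is not something this sketch can confirm.

```r
library(sparklyr)

# Start from sparklyr's default configuration; entries can be overridden here.
conf <- spark_config()

sc <- spark_connect(
  master     = "local",                   # assumption: local Spark installed via spark_install
  spark_home = Sys.getenv("SPARK_HOME"),  # same as the documented default, written out explicitly
  method     = "shell",                   # documented default: launches through spark-submit
  config     = conf
)

# Close the connection when finished.
spark_disconnect(sc)
```

With method = "shell", spark-submit has to be reachable under spark_home on the machine where the R session runs.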

 

The connection fails with the following error:

Error in system2(file.path(spark_home, "bin", "spark-submit"), "--version",  : 
  '"/opt/cloudera/parcels/SPARK2/bin/spark-submit"' not found

 

We receive the same error when we set SPARK_HOME to point to the Spark 1.6 installation.
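The error shows sparklyr looking for bin/spark-submit underneath spark_home, so one way to narrow the problem down is to run the same lookup by hand from the R session. The sketch below uses base R only. The helper check_spark_home is hypothetical and not part of sparklyr, and the second candidate path is an assumption about how the SPARK2 parcel is commonly laid out, which should be verified on the actual cluster.

```r
# Hypothetical helper (not part of sparklyr): for each candidate SPARK_HOME,
# report whether bin/spark-submit exists on the machine running this R session.
check_spark_home <- function(spark_home) {
  submit <- file.path(spark_home, "bin", "spark-submit")
  data.frame(
    spark_home   = spark_home,
    spark_submit = submit,
    exists       = file.exists(submit),
    stringsAsFactors = FALSE
  )
}

candidates <- c(
  Sys.getenv("SPARK_HOME"),                  # whatever the session currently sees
  "/opt/cloudera/parcels/SPARK2",            # path taken from the error message
  "/opt/cloudera/parcels/SPARK2/lib/spark2"  # assumption: common SPARK2 parcel layout
)
candidates <- candidates[nzchar(candidates)]  # drop SPARK_HOME if it is unset

print(check_spark_home(candidates))
```

If every row shows exists = FALSE, the paths are not visible from the machine where RStudio Desktop runs, which would be consistent with getting the same error for both the Spark 2 and the Spark 1.6 settings.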
