Hi, I'm Aki. Thanks for your post. According to your output message, your spark_home setting is wrong:

/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/bin/../lib/spark/bin/spark-submit: line 27: /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/bin/spark-class: No such file or directory

`spark-class` should be found at `/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/bin/spark-class`. So you should set `spark_home` to /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark, not /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib. Alternatively, you can just use /opt/cloudera/parcels/CDH/lib/spark.

You can set the environment variable as follows:

Sys.setenv(SPARK_HOME = "/opt/cloudera/parcels/CDH/lib/spark")

Or could you try copying and pasting the following code? Judging from your output, it seems you didn't try my code. I don't know whether a difference in the Spark patch version affects the connection, but I guess you may be able to replace spark_version with "1.6.0".

library(sparklyr)
config <- spark_config()
config$spark.driver.cores <- 4
config$spark.executor.cores <- 4
config$spark.executor.memory <- "4G"
spark_home <- "/opt/cloudera/parcels/CDH/lib/spark"
spark_version <- "1.6.2"
sc <- spark_connect(master = "yarn-client", version = spark_version, config = config, spark_home = spark_home)
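
If it helps, here is a small sketch for checking a candidate directory before connecting. It only tests whether `bin/spark-class` exists under the directory, which is the file the error message says is missing. The helper name `has_spark_class` is my own and is not part of sparklyr, and the two paths are simply the ones discussed in this thread, so adjust them to your cluster.

```r
# Hypothetical helper (not part of sparklyr): returns TRUE when
# `spark_home` contains bin/spark-class, the script spark-submit calls.
has_spark_class <- function(spark_home) {
  file.exists(file.path(spark_home, "bin", "spark-class"))
}

# Candidate directories taken from this thread; adjust for your cluster.
candidates <- c(
  "/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib",
  "/opt/cloudera/parcels/CDH/lib/spark"
)

# Report which candidates look like a usable Spark home.
for (path in candidates) {
  status <- if (has_spark_class(path)) "OK" else "spark-class not found"
  message(path, ": ", status)
}
```

A directory reported as "OK" is one I would expect to work as `spark_home`, although this check alone does not guarantee that the connection succeeds.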
I think this blog post will help you. If you use RHEL/CentOS, you can refer to the following script: https://github.com/jordanvolz/director-sparklyr-bootstrap/blob/master/sparklyr-bootstrap.sh