Member since 03-25-2017 | 2 Posts | 0 Kudos Received | 0 Solutions
03-25-2017 06:37 AM
Hi, I'm Aki. Thanks for posting.

According to your output message, your spark_home setting is wrong:

/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/bin/../lib/spark/bin/spark-submit: line 27: /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/bin/spark-class: No such file or directory

`spark-class` should be found at `/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/bin/spark-class`. So you should set `spark_home` to /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark, not /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib. Alternatively, you can just use /opt/cloudera/parcels/CDH/lib/spark.

You can set the environment variable as follows:

Sys.setenv(SPARK_HOME = "/opt/cloudera/parcels/CDH/lib/spark")

Or could you try copying and pasting the following code? Judging from your output, it seems you didn't try my code. I don't know whether a difference in the Spark patch version affects the connection, but I guess you may be able to change spark_version to "1.6.0".

library(sparklyr)
config <- spark_config()
config$spark.driver.cores <- 4
config$spark.executor.cores <- 4
config$spark.executor.memory <- "4G"
spark_home <- "/opt/cloudera/parcels/CDH/lib/spark"
spark_version <- "1.6.2"
sc <- spark_connect(master = "yarn-client", version = spark_version, config = config, spark_home = spark_home)
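
Before connecting, it may help to confirm that the directory you pass as spark_home really contains the launcher scripts named in the error message. The following is only a minimal sketch in base R: the helper names `is_valid_spark_home` and `find_spark_home` are hypothetical (they are not part of sparklyr), and the candidate paths are the ones mentioned above, which may differ on your cluster.

```r
# Hypothetical helper: TRUE when `spark_home` contains the launcher
# scripts bin/spark-class and bin/spark-submit.
is_valid_spark_home <- function(spark_home) {
  launchers <- file.path(spark_home, "bin", c("spark-class", "spark-submit"))
  all(file.exists(launchers))
}

# Hypothetical helper: first candidate that passes the check, or NA if none does.
find_spark_home <- function(candidates) {
  valid <- Filter(is_valid_spark_home, candidates)
  if (length(valid) == 0) NA_character_ else valid[[1]]
}

# Assumed candidate paths, taken from the discussion above.
candidates <- c(
  "/opt/cloudera/parcels/CDH/lib/spark",  # suggested spark_home
  "/opt/cloudera/parcels/CDH/lib"         # parent directory, expected to fail
)

spark_home <- find_spark_home(candidates)
if (is.na(spark_home)) {
  message("No candidate contains bin/spark-class; check the parcel path.")
} else {
  Sys.setenv(SPARK_HOME = spark_home)
  message("SPARK_HOME set to ", spark_home)
}
```

If the check fails for every candidate, listing the parcel directory on the node where R runs should show where `spark-class` actually lives.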
03-25-2017 06:03 AM
I think this blog post will help you. If you use RHEL/CentOS, you can refer to the following script: https://github.com/jordanvolz/director-sparklyr-bootstrap/blob/master/sparklyr-bootstrap.sh