Support Questions

Find answers, ask questions, and share your expertise

How do I Setup R (SparkR) for Spark2?

I have RStudio installed and running on my edge node. I installed R on 6 datanodes running Spark2. I have several questions.

  1. I have R version 3.4.1 (2017-06-30) -- "Single Candle" installed on the datanodes. Do I need to set HOME directories on the datanodes, or do I need other programs installed? (Running the latest HDP.)
  2. I have run sparklyr before, and it creates a YARN job when running. Is SparkR different from sparklyr?
  3. What packages must be installed on the R server when running SparkR?
  4. Is there a good step-by-step guide on setting up and running SparkR?

Expert Contributor

Hi Clay,

1. Once you install R, SparkR should just pick it up from the default location. I don't think there is a need to set up HOME directories.

2. SparkR can run on yarn or in local mode, depending on how you submit the actual job.
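To make the local-vs-YARN distinction concrete, here is a minimal sketch of starting a SparkR session in each mode. Assumptions: Spark 2.x SparkR API, and an HDP-style `SPARK_HOME` of `/usr/hdp/current/spark2-client` (adjust to your install).

```r
# Point R at the Spark 2 client install if SPARK_HOME isn't already set
# (the path below is an assumption for a typical HDP layout).
if (nchar(Sys.getenv("SPARK_HOME")) < 1) {
  Sys.setenv(SPARK_HOME = "/usr/hdp/current/spark2-client")
}
library(SparkR, lib.loc = file.path(Sys.getenv("SPARK_HOME"), "R", "lib"))

# Local mode: driver and executors all run on this one node.
sparkR.session(master = "local[2]", appName = "sparkr-local-test")

# YARN client mode instead: this creates a YARN application visible in the
# ResourceManager UI, just like a sparklyr job.
# sparkR.session(master = "yarn", appName = "sparkr-yarn-test")
```

So SparkR and sparklyr are two different R front ends to the same Spark engine; whether either shows up as a YARN job depends on the `master` you submit with, not on which package you use.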

3. You need R and its dependencies installed. You can add additional packages as needed from inside your SparkR jobs; I think you just use the install.packages command.
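A short sketch of that: calling `install.packages()` from an R session on the driver. The package name and CRAN mirror below are illustrative assumptions. One caveat worth knowing: this only installs on the node where the R session runs, so any package used inside executor-side functions such as `spark.lapply()` or `dapply()` must also be installed on every worker (e.g. all 6 datanodes).

```r
# Install a package for use in SparkR code on this node
# ("data.table" and the mirror URL are illustrative, not required).
install.packages("data.table", repos = "https://cran.r-project.org")

# Caveat: the call above installs only locally. Code shipped to executors,
# e.g. via spark.lapply(), needs the package present on every worker node:
# results <- spark.lapply(1:6, function(i) {
#   library(data.table)  # must already be installed on the datanode
#   ...
# })
```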

4. This link is a good starting point; it has internal links that give details on how Spark and R integrate.

So the link to the Spark Component Guide refers to Spark version 1.6.3. Are the steps different for Spark2?

Expert Contributor

@Clay McDonald The steps should be the same.