I have RStudio installed and running on my edge node. I installed R on 6 datanodes running Spark2. I have several questions.
1. Once you install R, SparkR should just pick it up from the default location; I don't think there is a need to set up HOME directories.
2. SparkR can run on YARN or in local mode, depending on how you submit the actual job: https://spark.apache.org/docs/latest/running-on-yarn.html
3. You need R and its dependencies installed. You can add additional packages as needed from inside your SparkR jobs; I think you just use the install.packages() command.
4. This link is a good starting point: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_spark-component-guide/content/ch_spark-r... It contains internal links that give details on how Spark and R integrate.
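To illustrate point 2, here is a minimal sketch of how the submit command decides where SparkR runs; the script name `my_analysis.R` is just a placeholder for your own script:

```shell
# Run the SparkR script on YARN; client mode keeps the driver
# on the edge node while executors run on the datanodes.
spark-submit --master yarn --deploy-mode client my_analysis.R

# Or run entirely in local mode on the edge node for quick testing,
# using two local threads.
spark-submit --master "local[2]" my_analysis.R
```

For point 3, inside the R script you can call, for example, install.packages("data.table") to pull an extra package (the package name here is only an example), though preinstalling frequently used packages on every datanode avoids repeated downloads at job time.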
So the link to the Spark Component Guide refers to Spark version 1.6.3. Are there different steps for Spark2?