03-04-2016 06:55 AM
I need to change the Python interpreter that is used with my CDH 5.5.1 cluster. My research pointed me to setting PYSPARK_PYTHON in spark-env.sh. I tried that manually without success. I then used Cloudera Manager to set the variable in both the 'Spark Service Environment Advanced Configuration Snippet' and the 'Spark Service Advanced Configuration Snippet', and just about everywhere else that referenced spark-env.sh. This hasn't worked and I'm at a loss as to where to go next.
03-04-2016 08:24 AM
You need to add the PYSPARK_PYTHON variable to the YARN configuration:
`YARN (MR2 Included) Service Environment Advanced Configuration Snippet (Safety Valve)`
Do that, restart the cluster and you are good to go.
04-21-2016 01:08 PM
Dear SparkeyG, could you please elaborate on how to add the PYSPARK_PYTHON variable to the YARN configuration snippet as per your suggestion? What format do I need to use for the snippet? Would you be so kind as to post an example?
04-21-2016 01:16 PM
From Cloudera Manager, select Clusters -> Spark.
In the search box of the filters panel, search for 'Service Environment Advanced'.
In the Spark Service Environment Advanced Configuration Snippet (Safety Valve) box, enter something like:
Click Save Changes, then distribute the changes and restart your Spark cluster.
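As a minimal sketch of what such a snippet entry could look like, assuming the safety valve accepts one KEY=VALUE line per environment variable and that the interpreter you want lives at the hypothetical path /opt/anaconda/bin/python on every node of the cluster:

```shell
# Hypothetical interpreter path; replace it with the Python installed on all of your nodes.
PYSPARK_PYTHON=/opt/anaconda/bin/python
```

The same line should be usable in the YARN service environment snippet mentioned earlier in the thread. The path has to exist on every worker node, otherwise the executors cannot start that interpreter.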