HDP comes with Python 2.6, but for Spark jobs we would like to use Python 2.7.
What changes do we need to make so that only Spark picks up the installed 2.7 version? Thanks.
You need to add the options below to your spark-env.sh. You should then be able to run PySpark jobs on 2.7. Let me know if you face any issues.
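For reference, here is a minimal sketch of what such spark-env.sh entries might look like. The exact options from the reply are not shown in this thread, so the choice of variables here is an assumption, as is the interpreter path /usr/bin/python2.7 and the helper variable PYTHON27 (both hypothetical; use wherever 2.7 is actually installed). PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are standard Spark environment variables; SPARK_YARN_USER_ENV applies to older Spark-on-YARN releases and may not be needed on newer ones.

```shell
# Hypothetical path: point this at the Python 2.7 binary on your nodes.
PYTHON27=/usr/bin/python2.7

# Interpreter used by the executors (and by the driver unless overridden).
export PYSPARK_PYTHON="$PYTHON27"
# Interpreter used by the driver process.
export PYSPARK_DRIVER_PYTHON="$PYTHON27"
# Older Spark-on-YARN releases: pass the variable into the YARN containers.
export SPARK_YARN_USER_ENV="PYSPARK_PYTHON=$PYTHON27"
```

Because these are set only in spark-env.sh, the system default Python 2.6 stays untouched for everything other than Spark.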
@Sandeep Nemuri, thanks for your reply. Does it have to be done on all the nodes, or only on the node from which I'm launching the Spark jobs?
If you run jobs in yarn-cluster mode, then the above 3 options should be set on all the nodes. You can add them through Ambari so that it pushes the changes to all the nodes.