Created on 09-11-2018 09:50 PM - edited 09-16-2022 06:41 AM
I have two versions of Python installed (2.6 and 2.7). Spark jobs run through the PySpark shell pick up one version of Python (2.7), but jobs submitted to the cluster via YARN pick up the 2.6 version. How can I get YARN jobs to point to the 2.7 version?
Created 09-11-2018 10:17 PM
@Jon Page Try these before running the spark-submit command:
export PYSPARK_DRIVER_PYTHON=/opt/anaconda2/bin/python
export PYSPARK_PYTHON=/opt/anaconda2/bin/python
/opt/anaconda2/bin/python should be the location of your Python 2.7 interpreter (this path should be the same on all cluster nodes).
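As an alternative sketch (assuming Spark 2.1 or later on YARN), the same interpreter can be passed per job on the spark-submit command line instead of exporting shell variables; the /opt/anaconda2 path and your_job.py below are placeholders for your actual interpreter location and application:

# Placeholder paths; point them at wherever Python 2.7 lives on every node.
spark-submit \
  --master yarn \
  --conf spark.pyspark.python=/opt/anaconda2/bin/python \
  --conf spark.pyspark.driver.python=/opt/anaconda2/bin/python \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/opt/anaconda2/bin/python \
  your_job.py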
HTH
*** If this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.
Created 09-12-2018 09:50 PM
Thanks, this did work for me!
Is there a way to configure the Hadoop cluster to use a specific installed version of Python?