multiple versions of python issues

I have two versions of Python installed (2.6 and 2.7). Spark jobs run through the pyspark shell pick up Python 2.7, but jobs submitted to the cluster via YARN pick up Python 2.6. How can I get YARN jobs to point to the 2.7 version?

Accepted Solutions

Re: multiple versions of python issues

@Jon Page Try setting these before running the spark-submit command:

export PYSPARK_DRIVER_PYTHON=/opt/anaconda2/bin/python

export PYSPARK_PYTHON=/opt/anaconda2/bin/python

/opt/anaconda2/bin/python should be the location of your Python 2.7 interpreter (the path should be the same on all cluster nodes).
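
A mistyped path in these exports only shows up once the job fails, so it can help to check the interpreter before exporting. Below is a minimal sketch of that idea; `use_pyspark_python` is a hypothetical helper name, not part of Spark, and it only validates the path on the machine where you run it, not on the other cluster nodes.

```shell
# Hypothetical helper (not part of Spark): export the two PySpark
# interpreter variables only if the given path is an executable file.
use_pyspark_python() {
    if [ ! -f "$1" ] || [ ! -x "$1" ]; then
        echo "not an executable file: $1" >&2
        return 1
    fi
    export PYSPARK_PYTHON="$1"
    export PYSPARK_DRIVER_PYTHON="$1"
}

# Example use (path taken from the answer above; adjust to your install):
#   use_pyspark_python /opt/anaconda2/bin/python && spark-submit ...
```

If the job is submitted in YARN cluster mode, the driver runs on the cluster rather than in your shell, so the same path may also need to be passed through Spark's `spark.yarn.appMasterEnv.PYSPARK_PYTHON` property with `--conf`. Whether that is needed depends on your Spark version and deploy mode.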

HTH


Re: multiple versions of python issues

Thanks, this did work for me!

Is there a way to configure the Hadoop cluster to use a specific installed version of Python?
