Created on 01-30-2018 08:00 PM - edited 09-16-2022 05:48 AM
Hi,
Is it possible to use the PySpark client on CentOS 7 (Python 2.7) with a YARN cluster running HDP 2.5 on CentOS 6 (Python 2.6)?
Best Regards
Gerald
Created 01-31-2018 08:41 AM
I don't think this is possible. If the driver and the workers run two different Python versions, the application will fail with the exception: "Exception: Python in worker has different version 2.6 than that in driver 2.7, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set."
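As a rough illustration, assuming Python 2.7 is installed at /usr/bin/python2.7 on every node (an assumed path, not something from this thread), both variables would be exported before launching the job so the driver and the executors agree:

# Both variables must point to the same minor Python version,
# otherwise PySpark fails with the version-mismatch exception above.
export PYSPARK_PYTHON=/usr/bin/python2.7          # Python used by the YARN executors
export PYSPARK_DRIVER_PYTHON=/usr/bin/python2.7   # Python used by the driver on the client
spark-submit --master yarn --deploy-mode client my_job.py   # my_job.py is a placeholder script name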
You can also refer to this question: https://community.hortonworks.com/questions/101952/zeppelin-pyspark-cannot-run-with-different-minor-...
Created 01-31-2018 09:36 AM
Thanks for your answer. My problem is that one client needs to use OrientDB and the pyorient connector. The issue is that pyorient isn't compatible with Python 2.6, so we can't use pyorient in our Python program.
Best Regards
Gérald
Created 01-31-2018 09:44 AM
Is it feasible to install Python 2.7 on your CentOS 6 cluster?
If you can install Python 2.7, then modify spark-env.sh to use it by changing the properties below:
export PYSPARK_PYTHON=<path to python 2.7>
export PYSPARK_DRIVER_PYTHON=python2.7
Steps for changing spark-env.sh :
1) Log in to Ambari
2) Navigate to the Spark service
3) Under 'Advanced spark2-env', modify 'content' to add the properties described above (a sketch of the resulting lines follows below)
Attaching screenshot: spark-changes.png
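For what it's worth, a minimal sketch of the lines added to the 'content' field of 'Advanced spark2-env' could look like this (the /usr/bin/python2.7 path is only an assumption; use wherever Python 2.7 is actually installed on your nodes):

# Make PySpark use Python 2.7 on both the driver and the executors.
# Assumed install location; adjust to the actual path on your nodes.
export PYSPARK_PYTHON=/usr/bin/python2.7
export PYSPARK_DRIVER_PYTHON=/usr/bin/python2.7

Ambari will then prompt you to restart the affected Spark components so the new environment takes effect.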
Created 01-31-2018 10:10 AM
Thanks for your answer.
If I do that, what would be the impact on the other HDP services that use Python 2.6?
Best Regards
Gérald