Configuring CDH cluster with Python 3
Labels: Apache Spark, Cloudera Manager
Created on ‎01-07-2018 05:13 AM - edited ‎09-16-2022 05:42 AM
Hi All,
We are using the community version of CDH 5.8.3 and want to add support for Python 3.5+ to our cluster, since our research algorithms need Python 3.5+ to run their Spark jobs successfully.
I know that Cloudera and Anaconda have a parcel to support Python, but that parcel supports only Python 2.7.
What is the recommended way to enable Python 3+ on a CDH cluster?
Best,
Eyal
Created ‎05-07-2019 02:26 AM
As mentioned in my previous posts, the Anaconda parcel for CDH comes only with Python 2.7, and I could not find a free way to get a parcel with Python 3+.
We ended up installing the Python versions we needed manually, keeping a separate virtual env for each Python version.
We executed the following procedure to install Python 3.5:
yum install python-pip
curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py"
python get-pip.py
pip install virtualenv
yum install -y https://centos7.iuscommunity.org/ius-release.rpm
yum install -y python35u python35u-libs python35u-devel python35u-pip
mkdir -p /opt/venv35
cd /opt/venv35
virtualenv venv35 -p python3.5
source venv35/bin/activate
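After running the procedure, it can help to confirm that the new virtual env really resolves to Python 3.5 on each node. The following is a minimal sketch, not part of the original procedure: `check_python_version` is a hypothetical helper, and the interpreter path assumes the commands above were run exactly as written.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the given interpreter reports
# the expected major.minor version (for example "3.5").
check_python_version() {
  local interpreter="$1" expected="$2" actual
  # Ask the interpreter itself, so the result does not depend on how
  # `python --version` formats its output.
  actual="$("$interpreter" -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null)" || return 1
  [ "$actual" = "$expected" ]
}

# Path assumes the virtualenv "venv35" was created inside /opt/venv35 as above.
if check_python_version /opt/venv35/venv35/bin/python 3.5; then
  echo "venv35 is on Python 3.5"
else
  echo "venv35 is missing or not on Python 3.5" >&2
fi
```

Running the same check on every worker node is worthwhile, since a PySpark job fails on any executor host where the interpreter is absent.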
Best, Eyal
Created ‎01-08-2018 10:18 PM
Hi
You can use the following link:
https://www.anaconda.com/blog/developer-blog/python-35-support-anaconda/
Created ‎01-09-2018 12:55 AM
Isn't there a better option, like the Cloudera-Anaconda parcel, which can be managed using CM?
Created ‎01-14-2018 01:34 AM
Hi All,
Additional thoughts on the question I asked?
Best,
Eyal
Created ‎02-01-2018 03:13 AM
Hi Divyani,
Is the solution you offered the best one (the post you shared is from Sep 2015)?
Best,
Eyal
Created ‎02-04-2018 11:11 PM
Hi Eyal,
Did you check this post from Cloudera about the Anaconda parcel?
http://blog.cloudera.com/blog/2016/02/making-python-on-apache-hadoop-easier-with-anaconda-and-cdh/
Created ‎10-17-2018 01:29 AM
Hi Divyani,
Thanks for the link you shared, but again, the Anaconda parcel only comes with Python 2.7 and I need Python 3.5.
So if I want to enable Python 3.5 in the cluster, what are the recommended methods?
What if I want to enable multiple Python versions in the cluster and enable each app to run with its own Python version?
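One possible shape for per-application interpreter selection is sketched below. It assumes a hypothetical layout with one virtualenv per Python version under a common root, present at the same path on every node; `VENV_ROOT` and `python_for_version` are illustrative names, not something CDH or Cloudera Manager provides. `PYSPARK_PYTHON` is the environment variable PySpark reads to pick its interpreter.

```shell
#!/usr/bin/env bash
# Assumed layout (hypothetical): one virtualenv per version under a common root,
# e.g. /opt/venv35/venv35 for Python 3.5, identical on every node.
VENV_ROOT="${VENV_ROOT:-/opt}"

# Map a version such as "3.5" to its interpreter path,
# e.g. /opt/venv35/venv35/bin/python.
python_for_version() {
  local tag
  tag="venv$(printf '%s' "$1" | tr -d '.')"   # "3.5" -> "venv35"
  echo "${VENV_ROOT}/${tag}/${tag}/bin/python"
}

# Each application exports its own interpreter before launching PySpark.
export PYSPARK_PYTHON="$(python_for_version 3.5)"
echo "This job would run with: ${PYSPARK_PYTHON}"
```

Because the variable is set per shell or per job script, two applications can run side by side with different interpreters without changing any cluster-wide default.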
Best, Eyal
Created ‎10-17-2018 11:45 AM
Hi
Continuum ships the Anaconda parcel, and Cloudera has no control over which Python version it installs.
Please use the OS package management tool to install Python 3.5 on the servers in the CDH cluster. Once that is done, please follow this doc to set Python for your PySpark job:
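As a rough illustration of what setting Python for a PySpark job can look like, the sketch below uses the `PYSPARK_PYTHON` environment variable together with the `spark.yarn.appMasterEnv.*` and `spark.executorEnv.*` Spark properties. The interpreter path, the `PY35` variable, the `submit_args` helper, and `my_job.py` are all assumptions for illustration; the script only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Assumption: Python 3.5 is installed at this same path on every node.
# The path is hypothetical; adjust it to wherever the OS package put it.
PY35="${PY35:-/usr/bin/python3.5}"

# In client mode, spark-submit picks up PYSPARK_PYTHON from the submitting shell.
export PYSPARK_PYTHON="$PY35"

# Arguments for a YARN cluster-mode submission, where the driver runs inside
# the application master and the variable has to be passed explicitly.
submit_args() {
  printf '%s\n' \
    --master yarn --deploy-mode cluster \
    --conf "spark.yarn.appMasterEnv.PYSPARK_PYTHON=$PY35" \
    --conf "spark.executorEnv.PYSPARK_PYTHON=$PY35" \
    "$1"
}

# Print the command instead of running it; my_job.py is a placeholder.
echo spark-submit $(submit_args my_job.py)
```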
Created ‎05-06-2019 12:37 PM
Can you give some steps and instructions for installing Python 3.5 or the Anaconda package on a CDH cluster? The parcel approach is not working as expected: the parcel shows as distributed and activated, but it does not come with Python 3.5 and is still using Python 2.7. Please let me know if there is any document on manually installing Anaconda with Python 3.5 on a CDH cluster from the command line.
