New Contributor
Posts: 8
Registered: ‎07-16-2017
Accepted Solution

Configuring CDH cluster with Python 3

Hi All,

We are using CDH 5.8.3 (community version) and want to add Python 3.5+ support to our cluster, since our research algorithms need Python 3.5+ for their Spark jobs to run successfully.

I know that Cloudera and Anaconda offer a parcel that adds Python support, but that parcel only supports Python 2.7.

What is the recommended way to enable Python 3+ on a CDH cluster?

Best,

Eyal

Contributor
Posts: 52
Registered: ‎11-03-2016

Re: Configuring CDH cluster with Python 3

New Contributor
Posts: 8
Registered: ‎07-16-2017

Re: Configuring CDH cluster with Python 3

Isn't there a better option, like the Cloudera-Anaconda parcel, that can be managed using CM?

New Contributor
Posts: 8
Registered: ‎07-16-2017

Re: Configuring CDH cluster with Python 3

Hi All,

Any additional thoughts on the question I asked?

Best,

Eyal

New Contributor
Posts: 8
Registered: ‎07-16-2017

Re: Configuring CDH cluster with Python 3

Hi Divyani,

Is the solution you offered still the best one? The post you shared is from Sep 2015.

Best,

Eyal

Contributor
Posts: 52
Registered: ‎11-03-2016

Re: Configuring CDH cluster with Python 3

Hi Eyal,

Did you check this post from Cloudera about the Anaconda parcel?

http://blog.cloudera.com/blog/2016/02/making-python-on-apache-hadoop-easier-with-anaconda-and-cdh/

New Contributor
Posts: 4
Registered: ‎07-16-2017

Re: Configuring CDH cluster with Python 3

Hi Divyani,

Thanks for the link you shared, but again, the Anaconda parcel only comes with Python 2.7 and I need Python 3.5.

So if I want to enable Python 3.5 in the cluster, what are the recommended methods?

What if I want to enable multiple Python versions in the cluster and have each app run with its own Python version?

Best, Eyal

Cloudera Employee
Posts: 126
Registered: ‎03-01-2016

Re: Configuring CDH cluster with Python 3

Hi

Continuum ships the Anaconda parcel, and Cloudera has no control over which Python version it installs.

Please use the OS package management tool to install Python 3.5 on the servers in the CDH cluster. Once that is done, follow this doc to set the Python interpreter for your PySpark job:


https://www.cloudera.com/documentation/enterprise/5-8-x/topics/spark_python.html#spark_python__secti...
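
A minimal sketch of what that setting might look like, assuming Python 3.5 ended up at /usr/bin/python3.5 on every node. That path and the script name my_job.py are assumptions for illustration, not taken from the doc. PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are the environment variables PySpark reads to pick its interpreter; the spark-submit line is only printed here, not executed.

```shell
# Assumed interpreter location; it must be the same path on every node.
PY35=/usr/bin/python3.5

# Interpreter used by the executors and by the driver.
export PYSPARK_PYTHON="$PY35"
export PYSPARK_DRIVER_PYTHON="$PY35"

# In YARN cluster mode the application master needs the setting as well.
# my_job.py is a placeholder script name.
SUBMIT_CMD="spark-submit --master yarn --deploy-mode cluster --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=$PY35 my_job.py"

# Print the command instead of running it, so it can be reviewed first.
echo "$SUBMIT_CMD"
```

To make the choice permanent for all jobs, the same two export lines could go into spark-env.sh instead of the shell session.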

New Contributor
Posts: 1
Registered: ‎05-06-2019

Re: Configuring CDH cluster with Python 3

Can you give some steps and instructions for installing Python 3.5 or the Anaconda package in the CDH cluster? The parcel approach is not working as expected: the parcel shows as distributed and activated, but the cluster is still using Python 2.7 rather than Python 3.5. Please let me know if there is any document on manually installing Anaconda with Python 3.5 on a CDH cluster through the command line.

New Contributor
Posts: 8
Registered: ‎07-16-2017

Re: Configuring CDH cluster with Python 3

Hi MKay,

As mentioned in my previous posts, the Anaconda parcel for CDH comes only
with Python 2.7, and I could not find a free way to get a parcel with Python 3+.

We ended up manually installing the different Python versions we needed,
keeping a separate virtual env for each Python version.

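We executed the following procedure to install Python 3.5:

# Install pip for the system Python, then refresh it with get-pip.py
yum install python-pip
curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py"
python get-pip.py
# Install virtualenv with the system pip
pip install virtualenv
# Add the IUS repository and install Python 3.5 from it
yum install -y https://centos7.iuscommunity.org/ius-release.rpm
yum install -y python35u python35u-libs python35u-devel python35u-pip
# Create a Python 3.5 virtualenv under /opt/venv35 and activate it
mkdir -p /opt/venv35
cd /opt/venv35
virtualenv venv35 -p python3.5
source venv35/bin/activate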
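
A possible sketch of how separate virtualenvs could then be selected per application. It assumes each virtualenv follows the /opt/<name>/<name> layout from the steps above and exists at the same path on every node; the helper names venv_python and submit_cmd_for and the script name job_a.py are hypothetical. The spark-submit command is only printed, not run.

```shell
# Return the interpreter path inside a virtualenv laid out as
# /opt/<name>/<name> (e.g. /opt/venv35/venv35 from the steps above).
venv_python() {
    echo "/opt/$1/$1/bin/python"
}

# Build (but do not run) a spark-submit command pinned to one virtualenv.
# $1 = virtualenv name, $2 = job script (both placeholders).
submit_cmd_for() {
    py="$(venv_python "$1")"
    echo "PYSPARK_PYTHON=$py spark-submit --master yarn --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=$py $2"
}

# Print the command for a job that should run on the Python 3.5 virtualenv.
submit_cmd_for venv35 job_a.py
```

A job that needs another Python version would pass the name of a different virtualenv created the same way.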

Best, Eyal