Member since 07-16-2017
8 Posts
2 Kudos Received
1 Solution
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 15650 | 05-07-2019 02:26 AM |
05-07-2019
02:26 AM
2 Kudos
Hi MKay, As mentioned in my previous posts, the Anaconda parcel for CDH comes only with Python 2.7, and I could not find a free way to get a parcel with Python 3+. We ended up installing the Python versions we needed manually, keeping a separate virtual env for each Python version. We used the following procedure to install Python 3.5:

yum install python-pip
curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py"
python get-pip.py
pip install virtualenv
yum install -y https://centos7.iuscommunity.org/ius-release.rpm
yum install -y python35u python35u-libs python35u-devel python35u-pip
mkdir -p /opt/venv35
cd /opt/venv35
virtualenv venv35 -p python3.5
source venv35/bin/activate

Best, Eyal
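For the Spark jobs mentioned in the original question, PySpark still has to be told to use the new interpreter. The following is a minimal sketch, not a tested recipe for this cluster. It assumes the virtual env was created exactly as in the procedure above, so the interpreter is /opt/venv35/venv35/bin/python, and that it exists at the same path on every node. `build_spark_env`, `build_submit_command` and `my_job.py` are hypothetical names used for illustration; `PYSPARK_PYTHON` is the standard environment variable PySpark reads to choose its Python executable.

```python
import os

# Interpreter created by the procedure above:
# directory /opt/venv35, with the virtual env "venv35" inside it.
VENV_PYTHON = "/opt/venv35/venv35/bin/python"


def build_spark_env(venv_python=VENV_PYTHON, base_env=None):
    """Return a copy of the environment with PYSPARK_PYTHON set to the venv interpreter."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_PYTHON"] = venv_python
    return env


def build_submit_command(script, master="yarn"):
    """Build (but do not run) a spark-submit command line for the given script."""
    return ["spark-submit", "--master", master, script]
```

A job could then be launched with something like `subprocess.call(build_submit_command("my_job.py"), env=build_spark_env())`. Keeping the environment in a copy rather than exporting it globally leaves the cluster's default Python 2.7 untouched for other jobs.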
02-01-2018
03:13 AM
Hi Divyani, Is the solution you offered the best one (the post you shared is from Sep 2015)? Best, Eyal
01-09-2018
12:55 AM
Isn't there a better option, like the Cloudera-Anaconda parcel, which can be managed using CM?
01-07-2018
05:23 AM
Hi Eric, We followed Cloudera's guide on how to set up a LB using F5 and still had to increase the session timeout to 12 hours to stop long processes from failing. Are we missing anything? Do you know of other companies that implemented the LB using F5 and had similar issues? Best, Eyal
01-07-2018
05:13 AM
Hi All, We are using the community version of CDH 5.8.3 and want to add support for Python 3.5+ to our cluster, since our research algorithms need Python 3.5+ to run their Spark jobs successfully. I know that Cloudera and Anaconda provide a parcel to support Python, but this parcel supports Python 2.7. What is the recommended way to enable Python 3+ on a CDH cluster? Best, Eyal
Labels:
- Apache Spark
- Cloudera Manager
12-17-2017
01:06 AM
Hi Eric, To set up the proxy for HS2 using F5, I followed the exact steps described in Cloudera's "Impala HA with F5 BIG-IP" manual: http://www.cloudera.com/documentation/other/reference-architecture/PDF/Impala-HA-with-F5-BIG-IP.pdf I know the manual covers setting up a proxy for Impala, but I don't see how the steps should change for HS2. If this is an official Cloudera manual, how come Impala and HS2 sessions running long queries (longer than 1 hour) stopped being killed only after we increased the F5 proxy timeout itself from 1 hour to 12 hours? Was the procedure written against a specific F5 version? Best, Eyal
12-10-2017
08:52 AM
Hi All, We recently configured an F5 proxy in front of our HiveServer2 (HS2) services to support a growing number of clients accessing HS2, following this procedure: http://www.cloudera.com/documentation/other/reference-architecture/PDF/Impala-HA-with-F5-BIG-IP.pdf

We started with a session timeout of 1 hour, but quickly found that for long-running Hive queries the proxy kills the connection once the timeout is reached. I had assumed that Hive clients (beeline, PyHive, Cloudera ODBC, etc.) are aware that Hive processes are usually long and therefore implement a keep-alive mechanism to keep the connection active until the Hive process finishes. To my surprise, none of the Hive clients we use implements such a mechanism, and our long Hive processes stopped being killed by the proxy only when we increased its session timeout beyond our longest Hive query.

Digging a bit deeper into the HS2 configuration, I found the parameter hive.server2.idle.session.timeout, which is set to 12 hours, and understood why all our Hive processes worked perfectly before we started using the proxy. Our network guys said that setting the session timeout at the proxy level to 12 hours is not best practice and that the clients accessing HS2 should implement a keep-alive mechanism.

Is there a better way to address this keep-alive issue? Or is setting the proxy's session timeout longer than the longest query the way to go?

Best, Eyal
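To make the idea of a client-side keep-alive concrete, here is a minimal sketch of enabling TCP keep-alive probes on a plain Python socket. It is an illustration only: `enable_tcp_keepalive` is a hypothetical helper, the timing values are arbitrary examples chosen to sit well below a 1 hour proxy timeout, and whether a given Hive client exposes its underlying socket, or whether the F5 profile treats keep-alive probes as activity, are both assumptions that would need to be verified.

```python
import socket


def enable_tcp_keepalive(sock, idle_s=300, interval_s=60, probes=5):
    """Turn on TCP keep-alive probes for an existing socket and return it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The fine-grained options below are platform-specific (present on Linux),
    # so each one is set only if this Python build exposes it.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before the connection is dropped
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

The helper would be applied to a socket before connecting, for example `enable_tcp_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))`. Probes of this kind operate at the TCP level, so they do not touch the HS2 session itself; hive.server2.idle.session.timeout still applies on the server side.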
Labels:
- Apache Hive