Created 08-26-2021 02:58 PM
I will use Spark2 in CDP and need to install Python3. Do I need to install Python3 on every node in the CDP cluster, or only on one particular node?
A Spark2 job is executed in JVM containers that can be created on any worker node. Are the containers created from a template? If so, how is the template created, and where is it?
Thanks.
Created 08-30-2021 11:24 PM
Hi @Seaport
Yes, it is required if you want to run Python UDFs or do anything outside Spark SQL operations in your application.
If you are just using the Spark SQL API, there is no runtime requirement for Python.
If you are going to install Spark3, please check the supported versions below:
CDS Powered by Apache Spark requires one of the following Python versions:
Note: Spark 2.4 is not compatible with Python 3.8. The recommended version is Python 3.4+ (https://spark.apache.org/docs/2.4.0/#downloading). The Apache Jira SPARK-29536, which relates to Python 3.8, is fixed in Spark3.
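As a quick way to apply the note above on a node, here is a minimal sketch that checks whether a Python 3 interpreter falls in the range described for Spark 2.4 (3.4 or later, but below 3.8). The function name `is_ok_for_spark24` and the two constants are made up for illustration and are not part of Spark or CDP; the bounds are only what the note states, so please verify them against the documentation for your exact release.

```python
import sys

# Assumed bounds for Spark 2.4, taken only from the note in this thread:
# Python 3.4+ is recommended, and Python 3.8 is not compatible.
MIN_PY3 = (3, 4)
FIRST_INCOMPATIBLE = (3, 8)


def is_ok_for_spark24(version=None):
    """Return True if a (major, minor) version lies in the assumed range.

    Hypothetical helper: when no version is given, the interpreter
    running this script is checked. Python 2 is deliberately not covered.
    """
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return MIN_PY3 <= major_minor < FIRST_INCOMPATIBLE


if __name__ == "__main__":
    # Run this with the interpreter you plan to use for PySpark.
    print(sys.version_info[:2], is_ok_for_spark24())
```

Running the script with the same interpreter on each worker node should show quickly whether any node has a version outside that range.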
Created 09-03-2021 09:26 AM
@Seaport Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Vidya Sargur
Created 09-03-2021 05:08 PM
Vidya,
Thanks for your reply. Could you help me clarify the issue further? Does Spark (or another MapReduce tool) create the container using the local host as its template, at least to some degree?
Created 09-14-2021 11:38 PM
Hi @Seaport
As you know, resource managers such as YARN, Spark standalone, and Kubernetes create the containers. Internally, the resource manager uses a shell script to launch each container. Depending on the available resources, it may create one or more containers on the same node.
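To connect this back to the original question: assuming a default YARN setup where each container is an ordinary process launched on the worker node's own filesystem, the Python interpreter a job uses has to exist at the same path on every node that may host a container. The sketch below builds a `spark-submit` argument list that pins that interpreter. The helper name `build_submit_args`, the path `/usr/bin/python3`, and the script name `my_job.py` are placeholders for illustration; the three configuration keys are standard Spark properties, but please confirm the right ones for your deployment mode.

```python
def build_submit_args(python_path, app="my_job.py"):
    """Build a spark-submit argument list that pins the Python interpreter.

    Hypothetical helper. python_path must point to an interpreter that is
    installed at the same location on every worker node.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        # Interpreter used by PySpark on the driver and the executors.
        "--conf", "spark.pyspark.python=" + python_path,
        # Environment variable for the YARN application master (cluster mode).
        "--conf", "spark.yarn.appMasterEnv.PYSPARK_PYTHON=" + python_path,
        # Environment variable for the executor processes.
        "--conf", "spark.executorEnv.PYSPARK_PYTHON=" + python_path,
        app,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(build_submit_args("/usr/bin/python3")))
```

The list form can be passed to `subprocess.run` once the path and script name have been replaced with real values.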