06-07-2021 01:53 AM
We have CDH 6.1.1. When I run pyspark on the command line, it uses Python 3:

$ pyspark
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/06/07 10:47:44 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.0-cdh6.1.1
      /_/

However, when I run an Oozie workflow with a single Spark action and print the Python version from inside the Spark job, I find that it is using Python 2.7.5:

Python version
2.7.5 (default, Feb 20 2018, 09:19:12)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]
Version info.
sys.version_info(major=2, minor=7, micro=5, releaselevel='final', serial=0)

Here is the relevant part of the Python script I used:

import sys

print("Python version")
print (sys.version)
print("Version info.")
print (sys.version_info)
# In[ ]:
print("printing the values:")
print(sys.argv)
# print("Nominal Time:" + sys.argv[-1])
# print("Data Dependency:" + sys.argv[-2]) How can I change the python version used by oozie in spark actions to use python 3?
Labels:
- Apache Oozie
- Apache Spark