We are currently using Spark 1.6 on the CDH 5.10 platform and are upgrading from Python 2.7 to Python 3.6 using the Anaconda distribution. When I run spark-submit in client mode, the process fails with the error below:
File "/apps/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 381, in namedtuple
TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'
We are not clear about the cause of the failure. We checked the Spark documentation, and it says that Spark 1.6.0 is compatible with Python 3.0+.
Any thoughts or suggestions on this would be helpful.
Thanks, agreed. I also found the bug details.
The page you shared, https://spark.apache.org/docs/1.6.0/#downloading, says Spark is compatible with 2.6+ and 3.1+, which is totally misleading since 3.6 is 3.1+.
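For anyone who lands here with the same traceback: as far as I can tell, this is the issue tracked as SPARK-19019. PySpark makes its own copy of `collections.namedtuple` in serializers.py, and the copy appears to lose the defaults of keyword-only arguments, which Python 3.6 started using for `verbose`, `rename`, and `module`. The sketch below reproduces that effect in plain Python, without Spark. `make_record` and the two copy helpers are hypothetical names made up for illustration; they are not PySpark's actual code.

```python
import types


# Hypothetical stand-in for Python 3.6's namedtuple signature:
# three keyword-only arguments, each with a default value.
def make_record(typename, field_names, *, verbose=False, rename=False, module=None):
    return (typename, field_names, verbose, rename, module)


def copy_func_without_kwdefaults(f):
    # Copies code, globals, name, positional defaults, and closure.
    # Keyword-only defaults live in f.__kwdefaults__ and are NOT copied here.
    return types.FunctionType(f.__code__, f.__globals__, f.__name__,
                              f.__defaults__, f.__closure__)


def copy_func_with_kwdefaults(f):
    # Same copy, but the keyword-only defaults are carried over as well.
    fn = copy_func_without_kwdefaults(f)
    fn.__kwdefaults__ = f.__kwdefaults__
    return fn


# The incomplete copy treats the keyword-only arguments as required.
broken = copy_func_without_kwdefaults(make_record)
try:
    broken("Point", ["x", "y"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# The complete copy behaves like the original function.
fixed = copy_func_with_kwdefaults(make_record)
result = fixed("Point", ["x", "y"])
print(result)
```

If this is indeed the cause, the practical options on Spark 1.6.0 are to run PySpark with a Python version older than 3.6 (for example a separate Anaconda environment) or to move to a Spark release that includes the fix.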
I have started working on upgrading my app to Spark 2. Any suggestions for a Spark 1.6 to Spark 2 migration guide on a Cloudera cluster?