We found a small error in /opt/cloudera/parcels/SPARK2/lib/spark2/python/pyspark/ml/util.py: the "import warnings" line is missing.
This issue was fixed a few months ago in Spark (SPARK-19506).
Line 223 of the same file also appears to be missing its leading hash (#); it should be a comment:
# Remove the last package name "pipeline" for Pipeline and PipelineModel.
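For anyone who wants to check whether their own copy of util.py is affected, here is a minimal sketch that scans a Python source file for a plain import of a given module. The function name `imports_module` is made up for this example, and the check is a simple line-based text scan rather than a full parse, so it only recognises the common `import x` and `from x import ...` forms.

```python
import sys


def imports_module(source, module):
    """Return True if `source` contains a plain import of `module`.

    Hypothetical helper for illustration: a line-based scan that
    skips comment lines and matches `import <module>` or
    `from <module> import ...`.
    """
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore commented-out imports
        if stripped == "import " + module:
            return True
        if stripped.startswith("import " + module + " "):
            return True  # e.g. "import warnings as w"
        if stripped.startswith("from " + module + " import "):
            return True
    return False


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to util.py as the first argument.
    with open(sys.argv[1]) as handle:
        found = imports_module(handle.read(), "warnings")
    print("import warnings present" if found else "import warnings missing")
```

Run it with the path to util.py from the post above as the only argument. If it reports the import as missing, the file predates the fix.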
The fix for SPARK-19506 is incorporated in Cloudera Distribution of Apache Spark 2.2 release 1
For a full list of fixed issues, see here
For downloading Cloudera Distribution of Apache Spark 2.2 release 1, see here
So this actually means we have to either stay on a broken PySpark 2.1 or upgrade the JDK to 1.8, right?
Yes, JDK 8 is a requirement for Spark 2.2 (which has the fix for SPARK-19506).