08-31-2018 03:27 PM
It's been a whole week and I have been struggling with the default Spark 1.6. I trained a model using PySpark ML, but I am not able to save it; the main reason is that the save method is missing from the library in this older version.
I then tried to upgrade Spark following the right instructions; the installation completes and the service reports good health, but the conf folder for Spark 2 is always empty. I tried to put the required files there myself, but it doesn't help. I have a deadline coming up, so if anyone could help me ASAP, please do.
08-31-2018 08:23 PM
...but the conf folder for spark 2 is always empty
The symptoms you have shared indicate that the node from which you're trying to run the Spark 2 binaries doesn't have a gateway role. I am assuming that you are using Cloudera Manager to manage your CDH cluster(?) If so, please see Step 5b of the documentation, which requires that a gateway role be configured on the host(s) (usually the edge node) from which you plan to launch the Spark 2 binaries (spark2-shell, spark2-submit, pyspark2).
Once you've added the gateway role, redeploy the client configuration; this will ensure that the conf directory for Spark 2 is populated with all the required configuration and XML files.
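As a quick sanity check, you can verify the result from the gateway host after the redeploy. This is just a sketch: the paths below are Cloudera Manager's defaults and will differ with your parcel version.

```shell
# Run on the gateway/edge node after "Deploy Client Configuration" in CM.
# The spark2 conf directory should now be populated (default CM path shown;
# yours may differ):
ls -l /etc/spark2/conf.cloudera.spark2_on_yarn/

# The alternatives system should point spark2-conf at that directory:
alternatives --display spark2-conf
```

If the directory is still empty after redeploying, double-check that the gateway role was actually assigned to this host and not a different one.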
# alternatives --display spark2-conf
spark2-conf - status is auto.
 link currently points to /etc/spark2/conf.cloudera.spark2_on_yarn
/opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/etc/spark2/conf.dist - priority 10
/etc/spark2/conf.cloudera.spark2_on_yarn - priority 51
Current `best' version is /etc/spark2/conf.cloudera.spark2_on_yarn.
Please note that Spark 2.2 and above requires JDK 8; this is stated here along with the other prerequisites.
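Once the gateway role and JDK are in place, a minimal smoke test along these lines should confirm that the save API your Spark 1.6 install was missing is now available. This is only a sketch: /tmp/lr_model is an illustrative path, and it assumes pyspark2 is on the PATH of the gateway host.

```shell
# Confirm JDK 8 on the node before launching the spark2 binaries:
java -version        # should report a 1.8.0_* version for Spark 2.2+

# Quick check that the Spark 2 ML save API exists (it was absent in 1.6).
# "/tmp/lr_model" is just an example path; pick any writable location.
pyspark2 <<'EOF'
from pyspark.ml.classification import LogisticRegression

lr = LogisticRegression(maxIter=10)
# In Spark 2.x both estimators and fitted models support save():
lr.save("/tmp/lr_model")
EOF
```

If that runs cleanly, your trained model should save the same way via model.save(path).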
Let us know how it goes. Good Luck!