Created 03-31-2016 04:34 AM
Is there a full example of a PySpark workflow with Oozie anywhere? I found examples for Java Spark workflows, but I am not sure how to adapt them to HDP and PySpark.
Created 03-31-2016 04:50 AM
The Oozie Spark action is available in the community, but Hortonworks does not provide support for the Spark action in HDP 2.4 or below. As soon as it is available, there will be examples of PySpark in Oozie.
Created 04-01-2016 01:13 AM
I did not get any errors in a job using http://oozie.apache.org/docs/4.2.0/DG_SparkActionExtension.html but it is certainly not obvious how to use it for PySpark.
Created 02-16-2017 06:00 PM
@Erik Putrycz I added a PySpark workflow example that works with HDFS HA, RM HA, Oozie HA, and Kerberos: https://github.com/dbist/oozie/tree/master/apps/pyspark
Created 02-16-2017 09:07 PM
@Erik Putrycz Additionally, I added a tutorial here: https://community.hortonworks.com/articles/84071/apache-ambari-workflow-manager-view-for-apache-ooz-...
Created 04-01-2016 04:48 AM
@Erik Putrycz To use PySpark, you need to copy the Python file to HDFS and specify its HDFS path in the <jar> tag:
"<jar>${nameNode}/user/ambari-qa/examples/apps/spark/lib/pi.py</jar>"
You also need to export SPARK_HOME in your hadoop-env.sh.
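For context, here is a rough sketch of how that <jar> tag might sit inside a complete workflow.xml. It is only an illustration, not a tested HDP configuration: the workflow and action names are made up, the schema versions (workflow 0.5, spark-action 0.1) are assumed to match Oozie 4.2.0, and ${jobTracker}, ${nameNode} and ${master} are assumed to be defined in your job.properties.

```xml
<!-- Sketch only: names and schema versions are assumptions, adjust for your cluster -->
<workflow-app name="pyspark-example" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-node"/>
    <action name="spark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- e.g. yarn-cluster or yarn-client, set in job.properties -->
            <master>${master}</master>
            <name>Python-Spark-Pi</name>
            <!-- For PySpark, <jar> points at the .py file in HDFS; no <class> element is given -->
            <jar>${nameNode}/user/ambari-qa/examples/apps/spark/lib/pi.py</jar>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The main difference from the Java examples is that the <class> element is left out and <jar> references the Python script instead of a jar file.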