
How to use Spark engine with Falcon


I am using HDP 2.4, Spark 1.6.2.

I've recently installed Falcon and I was able to deploy the primary and backup clusters. I've also successfully run a mirror job.

Now I'm working on scheduling a Spark app. When I create a process, I can only choose from Oozie, Pig, and Hive; Spark is not selectable as an engine. When I try to add it via XML, the spark-attributes element gets cleared.

I am using XML like the following:

<process xmlns='uri:falcon:process:0.1' name='spark-process'>
    <clusters>
        <cluster name='primaryCluster'>
            <validity start='2017-07-03T00:00Z' end='2017-07-05T00:00Z'/>
        </cluster>
    </clusters>
    <workflow engine='spark' path='/app/spark'/>
    <spark-attributes>
        <name>Test Spark Wordcount</name>
        <spark-opts>--num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1</spark-opts>
    </spark-attributes>
    <retry policy='periodic' delay='minutes(3)' attempts='3'/>
    <ACL owner='ambari-qa' group='users' permission='0755'/>
</process>
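One thing worth ruling out before blaming the engine support: mismatched tags (like a closing spark-attributes with no opening one) can cause the UI to silently drop elements. A minimal well-formedness check in Python, using only the standard library (the embedded XML is an illustrative, tidied-up version of the process definition above):

```python
import xml.etree.ElementTree as ET

# Illustrative process XML, restructured so every tag is balanced
process_xml = """<process xmlns='uri:falcon:process:0.1' name='spark-process'>
  <clusters>
    <cluster name='primaryCluster'>
      <validity start='2017-07-03T00:00Z' end='2017-07-05T00:00Z'/>
    </cluster>
  </clusters>
  <workflow engine='spark' path='/app/spark'/>
  <spark-attributes>
    <name>Test Spark Wordcount</name>
    <spark-opts>--num-executors 1 --driver-memory 512m</spark-opts>
  </spark-attributes>
  <retry policy='periodic' delay='minutes(3)' attempts='3'/>
  <ACL owner='ambari-qa' group='users' permission='0755'/>
</process>"""

def is_well_formed(xml_text):
    """Return True if the XML parses, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(process_xml))                       # True
print(is_well_formed("<process></spark-attributes>"))    # False: mismatched tags
```

Note this only catches syntax errors; it does not validate against Falcon's process schema, which is what the server-side error below is about.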

Is there something I need to do before using Spark with Falcon, or is this functionality not supported with these component versions?

See screenshots to visualise the issue




Just found this error in falcon.application.log:

ERROR - [1388728910@qtp-1886491834-668 - c186eb8d-ef42-42f1-be4b-076e6ee27a5c:ambari-qa:POST//entities/submit/process] ~ Action failed: Bad Request Error: javax.xml.bind.UnmarshalException - with linked exception: [org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 383; cvc-enumeration-valid: Value 'spark' is not facet-valid with respect to enumeration '[oozie, pig, hive]'. It must be a value from the enumeration.] (FalconWebException:83)
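The SAXParseException shows Falcon validating the submission against its process XSD, whose workflow engine enumeration only contains oozie, pig, and hive. A hedged sketch of how that enumeration can be read out of an XSD fragment with the standard library (the fragment and its type name are illustrative, reconstructed from the values quoted in the error, not copied from the actual process-0.1.xsd):

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Illustrative fragment of what the engine enumeration in the process XSD
# might look like on this Falcon version (type name is an assumption)
xsd_fragment = """<xs:simpleType name="engine-type"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:restriction base="xs:string">
    <xs:enumeration value="oozie"/>
    <xs:enumeration value="pig"/>
    <xs:enumeration value="hive"/>
  </xs:restriction>
</xs:simpleType>"""

root = ET.fromstring(xsd_fragment)
engines = [e.get("value") for e in root.iter(XS + "enumeration")]
print(engines)            # ['oozie', 'pig', 'hive']
print("spark" in engines) # False -- exactly why 'spark' is rejected on submit
```

If the XSD shipped with the installed Falcon really lacks a spark value, no client-side XML will get past the unmarshaller, which matches the submit failing with Bad Request.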
