Created on 11-29-2017 04:59 PM - edited 08-17-2019 10:05 AM
There are many ways to integrate Apache NiFi and Apache Spark.
We can call Apache Spark Streaming via Apache NiFi's Site-to-Site (S2S) protocol or via Kafka. If you want to execute a regular Apache Spark job, you can do that via Apache Livy, which is included in HDP 2.6+. Apache Zeppelin integrates with Apache Spark the same way, so it is a secure and reasonable approach.
I use this approach when I want Spark to handle part of the processing in the middle of an Apache NiFi flow.
Syntax for Calling a Job
This job is stored in HDFS as /apps/logs*jar with the class name com.dataflowdeveloper.logs.Logs.
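As a rough illustration of what such a call looks like outside of NiFi, here is a minimal Python sketch that submits a batch job through Livy's REST endpoint POST /batches, using the "file" and "className" fields of the request body. The host name, port, and the helper names build_batch_request and submit_batch are assumptions made for this example, and the jar path is left as a parameter that you need to fill in with the actual location in HDFS.

```python
import json
import urllib.request


def build_batch_request(livy_url, jar_path, class_name, args=None):
    """Build (but do not send) a POST request for Livy's /batches endpoint.

    livy_url   -- base URL of the Livy server (hypothetical in the usage below)
    jar_path   -- location of the application jar, e.g. a path in HDFS
    class_name -- fully qualified main class of the Spark job
    args       -- optional list of command-line arguments for the job
    """
    payload = {"file": jar_path, "className": class_name}
    if args:
        payload["args"] = list(args)
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_batch(request, timeout=30):
    """Send the prepared request and return Livy's JSON reply as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumption: placeholder host and port; replace with your Livy server.
    # Assumption: placeholder jar path; replace with the real path in HDFS.
    req = build_batch_request(
        "http://livy.example.com:8998",
        "/apps/your-job.jar",
        "com.dataflowdeveloper.logs.Logs",
    )
    print(submit_batch(req))
```

Building the request separately from sending it makes it easy to inspect the JSON body before anything reaches the cluster. On a secured cluster you would most likely also need authentication, which this sketch does not cover.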