How to use NiFi to launch an Oozie workflow job?

Expert Contributor

Please help me understand the best practices for launching an Oozie workflow job from NiFi. For instance, once I write data into HDFS with a PutHDFS processor, I assume I should use an ExecuteScript processor to explicitly invoke the Oozie workflow job, either via the Java client API or REST API, or via the Oozie CLI, depending on whether the NiFi instance resides inside the HDP cluster.

Thanks,

Derek

1 ACCEPTED SOLUTION

Master Mentor

Although you can achieve what you're planning, it really goes against what both products are designed to do. Oozie uses coordinators and bundles to schedule its workflows, whereas NiFi is about continuously flowing data with no start or finish. Oozie does have a REST API you can invoke to start a workflow if you intend to go that route, but I would first ask what the user is trying to do. If you need a way to check whether data has landed in HDFS before executing an Oozie workflow, look at the following coordinator examples by Yahoo, specifically https://github.com/yahoo/oozie/wiki/Oozie-Coord-Use-Cases#triggering-coordinator-jobs-when-data-dire...

I also suggest you read my articles https://community.hortonworks.com/articles/85354/apache-ambari-workflow-manager-view-for-apache-ooz-... and

https://community.hortonworks.com/articles/85361/apache-ambari-workflow-manager-view-for-apache-ooz-...
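
For readers who do want to go the REST route mentioned above, here is a minimal Python sketch of the request involved. It assumes an unsecured (non-Kerberos) Oozie server and uses the `POST /v1/jobs?action=start` endpoint with an XML configuration body. The host name, user name, and HDFS path are hypothetical placeholders, and the helper names `build_oozie_config` and `build_start_request` are made up for illustration. The sketch only builds the request; it does not send it.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_oozie_config(properties):
    """Render job properties as a Hadoop-style XML configuration document."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<configuration>"]
    for name, value in properties.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


def build_start_request(oozie_url, properties):
    """Build (but do not send) a POST that submits and starts a workflow job."""
    body = build_oozie_config(properties).encode("utf-8")
    return urllib.request.Request(
        url=f"{oozie_url.rstrip('/')}/v1/jobs?action=start",
        data=body,
        headers={"Content-Type": "application/xml;charset=UTF-8"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_start_request(
        "http://oozie-host.example.com:11000/oozie",  # hypothetical server URL
        {
            "user.name": "derek",  # hypothetical submitting user
            # hypothetical HDFS directory containing workflow.xml
            "oozie.wf.application.path": "hdfs://namenode.example.com:8020/apps/my-workflow",
        },
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually submit, pass the request to urllib.request.urlopen().
```

The same URL, method, content type, and body could presumably be configured on a NiFi HTTP processor instead of being sent from a script; a secured cluster would additionally need authentication, which this sketch leaves out.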

6 REPLIES

Expert Contributor

I was just trying to understand whether there is any native NiFi and Oozie integration out of the box, like the one between NiFi and Flume. I guess not. Thanks, Artem.

Master Mentor

You can certainly achieve that, or contribute a patch, but it defeats the purpose. If you need more explanation, ping me on HipChat. Don't forget to accept the answer!

Expert Contributor

Will chat more on HipChat if needed. Thanks.

Explorer

Hello,

I have the same remark/question/proposal, which I would phrase differently:

==> Oozie may suffer beyond a certain number of workflows if the trigger is based on data availability.

The question (or answer) is:

When using NiFi to ingest data, instead of writing a "SUCCESS" flag to trigger the Oozie workflow, why not simply launch the workflow through a NiFi InvokeHTTP processor?

(And route failures to a second InvokeHTTP, or loop them back to the same one to relaunch for a limited period, bounded by the expiration setting on the connection, in case of a network problem, for example.)

I tested it and it works fine. My question is: are there any contraindications?

Best.
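
The retry idea in this post (send failures back to the same call until an expiration period runs out) can be made explicit with a small Python sketch. It is not NiFi configuration, only the equivalent control flow. The function name `submit_with_retry` and its parameters are hypothetical, and `submit` stands for any zero-argument callable that launches the workflow and raises `OSError` on a network failure.

```python
import time


def submit_with_retry(submit, expire_after_s=300.0, wait_s=10.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Call submit() until it succeeds or expire_after_s seconds have elapsed.

    submit: zero-argument callable that returns a job id, or raises OSError
            on a network problem (urllib.error.URLError is an OSError).
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + expire_after_s
    attempt = 0
    while True:
        attempt += 1
        try:
            return submit()
        except OSError as err:
            # Give up if the next attempt would start after the deadline.
            if clock() + wait_s > deadline:
                raise TimeoutError(f"gave up after {attempt} attempts") from err
            sleep(wait_s)
```

One caveat with any loop like this: if a request fails after the server has already accepted it (for example, a timeout while waiting for the response), a retry can start the same workflow twice, so the workflow should tolerate duplicate launches or the caller should check for an existing job first.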

@adelgacem, can you share a sample of how you invoked Oozie using the InvokeHTTP processor? I'm trying to run an Oozie workflow from NiFi, and it would be a great help if you could share the NiFi processor configuration details.