[Nifi] Modular Process Group for Notifications

We are looking to implement a single modular notification process group that expects various attributes (to, from, host, message, etc.) on input and uses them to send email and chat notifications. The idea is that every process group that needs to send notifications, regardless of where it sits in the hierarchy, will use this one process group, so that it is developed and maintained in one place and the scope of changes stays small. Is this possible? Based on my limited experience so far, it appears that process groups, input ports, and output ports only work for this if the notification process group is a child of each process group that needs to send an email. That would mean maintaining a separate notification process group (possibly from a template) for each process group that sends notifications, rather than just one. Is it possible to use remote process groups that link back to this internal process group?

The sample flow hierarchy below illustrates the intended architecture:

  • Sample process groups and hierarchy:
    • Notification process group: root->utilities->notify
      • Description: Modular process group for sending notifications
    • Load process group #1: root->marketing->sales_data_loader
      • Description: Loads sales data
      • Uses the notification process group to send success and failure notifications
    • Load process group #2: root->finance->assets_data_loader
      • Description: Loads financial asset data
      • Uses the notification process group to send success and failure notifications

Thanks

1 ACCEPTED SOLUTION


One approach is to build the notification process group as a single reusable asset that is checked out from a repository and dropped into new flows at whatever hierarchy level, and as many times, as needed. As such, it is change-managed and preconfigured with properties that are dynamic and ready to go for each environment. Additionally, the configuration can be modified for each instance used in a flow. See the following link:

https://community.hortonworks.com/articles/60868/enterprise-nifi-implementing-reusable-components-a....

Let me know if this is possibly what you are looking for; else, follow up with additional requirements/needs.

Wow, great post, thank you for sharing @Greg Keys. My goal is to architect things so that our development is very modular, reusable, and configuration driven. While I don't think this solves the question the way I was originally thinking, it does provide a great (maybe better) solution, with some added information that I was not aware of, and it gets me closer to my goals. A couple of questions...

  1. Where are templates stored, so that they can be pulled out and checked into git? I know the flows are in flow.xml.gz. When a template changes, the change will still need to be manually propagated to all the flows that use it, correct?
    1. Edit: it looks like templates can be downloaded via the UI or the API.
  2. How do you go about, or recommend, checking in the flows and process groups? Just pushing flow.xml.gz into git on a regular basis?
  3. I had no clue you could use OS environment variables and custom properties files. That will prove very useful, especially since the setting accepts a comma-separated list of properties files. What is meant by "system properties"? Is this just the values in the nifi.properties file?
  4. Does NiFi need to be bounced to pull in new properties from nifi.variable.registry.properties?
    1. Edit: Tested, and it appears you need to bounce NiFi for the new properties to be picked up.
  5. Are there system/flow-level properties such as process group name, process group id, etc.? I am wondering whether properties can be set in custom files specific to the process group name.
  6. Will an upload-template API call for an existing template just replace the existing template?
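For the API route mentioned in question 1, here is a minimal Python sketch of pulling a template out of NiFi so the XML can be committed to git. It assumes an unsecured NiFi 1.x instance and the REST paths `/nifi-api/templates/{id}/download` and `/nifi-api/process-groups/{id}/templates/upload`; check both against the REST API documentation for your NiFi version. The helper names, base URL, and ids are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen


def template_download_url(base_url, template_id):
    """Build the (assumed) REST URL for downloading a template as XML."""
    root = base_url.rstrip("/")
    return f"{root}/nifi-api/templates/{quote(template_id, safe='')}/download"


def template_upload_url(base_url, process_group_id):
    """Build the (assumed) REST URL for uploading a template into a process group."""
    root = base_url.rstrip("/")
    return f"{root}/nifi-api/process-groups/{quote(process_group_id, safe='')}/templates/upload"


def download_template(base_url, template_id, dest_path):
    """Fetch the template XML and write it to dest_path (e.g. inside a git working copy).

    Assumes no authentication; a secured cluster would need a token or client certificate.
    """
    with urlopen(template_download_url(base_url, template_id)) as resp:
        data = resp.read()
    with open(dest_path, "wb") as out:
        out.write(data)


if __name__ == "__main__":
    # Hypothetical host and template id, shown only to illustrate the URL shape.
    print(template_download_url("http://localhost:8080", "my-template-id"))
```

The downloaded file can then be committed like any other source file, and the upload URL is where a multipart POST of that file would go when promoting it to another environment.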

I need to read through this a bit more to completely digest it and the links provided but I believe it gets us going in the right direction. Thanks much!


Answers to questions:

  1. Templates are stored on the NiFi cluster where they are saved or uploaded. See the article visuals on how to access them. Correct ... they can be downloaded and uploaded via the UI or the API. (I believe the community is working on a central repository for templates, accessible by all UI instances ... still in the works, however.)
  2. It is similar to code. Keep in mind that when discussing templates you should think of them as either reusable assets (checked out to be used across the team or enterprise) or full flows (checked out after passing testing, to be promoted to a new environment, e.g. QA to UAT or Prod). See the diagram in the SDLC section of the article.
  3. From reading the NiFi literature, I think the distinction between system and OS properties is splitting hairs: both are properties we can retrieve directly from the environment (a system property is something like the line separator in Java; an OS property is something like what you set with export).
  4. Good question ... will test and update this answer.
  5. No, but you could handle this by namespacing your property names per process group. For example, you could prepend the process group name, or the last 10 digits of its uid, to the property name: 034g345d2.filepath, 034g345d2.threshold and 034f423ee1.filepath, 034f423ee1.threshold specify the same properties for two different process groups.
  6. No, and that is the really powerful part about this ... as the article states, each processor, process group, or connection has a UUID whose first part is a global id and whose second part is an instance id. When you download the template, the instance id is replaced by all 0s. When you upload to the canvas, the instance id is given a unique sequence. In that way you can reuse these templates as many times and at as many hierarchy levels as you want. (It works the same way with copy-paste ... you can copy any processor, set of connected processors (subflow), or process group and paste it into flows as many times as you wish. This works because each paste creates a new instance id for each component.)
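The namespacing idea in answer 5 can be sketched in a few lines of Python that generate the entries for a custom properties file. This is only an illustration: the function name, the sample paths, and the threshold values are made up, and the prefixes are the ones used in the answer above.

```python
def namespaced_properties(prefix, props):
    """Return 'prefix.key=value' lines for one process group, sorted by key."""
    return [f"{prefix}.{key}={value}" for key, value in sorted(props.items())]


if __name__ == "__main__":
    # Hypothetical per-process-group settings; the prefix identifies the process group.
    groups = {
        "034g345d2": {"filepath": "/data/sales", "threshold": "100"},
        "034f423ee1": {"filepath": "/data/assets", "threshold": "250"},
    }
    lines = []
    for prefix, props in groups.items():
        lines.extend(namespaced_properties(prefix, props))
    # The resulting text could be saved as one of the custom properties files.
    print("\n".join(lines))
```

Each process group then looks up only the keys carrying its own prefix, so a single properties file can hold the same setting for many groups without collisions.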

This appears to be exactly what I am looking for, though I am not sure whether it is currently being worked on or targeted for a particular release - https://cwiki.apache.org/confluence/display/NIFI/Reference-able+Process+Groups