Creating a dynamic data flow using apache NIFI

New Contributor

I need to create a data flow in Apache NiFi. I have an input text file containing comma-delimited rows exported from an RDBMS table. Different output text files will be created based on filter conditions on the data (e.g., the value of the first column is greater than 5), and the rows matching each condition will be moved to a separate output directory. I have implemented a static version of this using the GetFile, RouteText, and PutFile processors in the NiFi GUI. But now we have to implement it so that the conditions and the corresponding target files are entered by the user via a GUI and captured in a JSON file. So our process will have one input file containing the data and a JSON file with the filter conditions and corresponding target files. Since the number of output files and filter conditions is decided dynamically, I could not think of a way to create the flow in the NiFi GUI. Is it possible to create the flow using the GUI? I am new to this tool, so I don't know how to generate processors dynamically. Do I need to do some programming in Groovy/JavaScript/Java?

It would be very helpful if someone could suggest a high-level approach that I can follow to get the result.
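
To make the requirement concrete, here is a minimal plain-Python sketch of the intended behaviour, outside NiFi. The JSON rule layout (the keys target, column, op, value) and the sample rows are assumptions for illustration only; the actual format of the JSON file is not specified in this post.

```python
import csv
import io
import json
import operator

# Hypothetical rule format; the real JSON layout is not given in the thread.
RULES_JSON = """
[
  {"target": "TEST1", "column": 1, "op": "gt", "value": 5},
  {"target": "TEST2", "column": 1, "op": "le", "value": 5}
]
"""

# Map operator names used in the rules to Python comparison functions.
OPS = {
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
    "eq": operator.eq,
}


def route_rows(csv_text, rules):
    """Group rows under the target of every rule they satisfy."""
    routed = {rule["target"]: [] for rule in rules}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        for rule in rules:
            cell = float(row[rule["column"] - 1])  # columns are 1-based
            if OPS[rule["op"]](cell, rule["value"]):
                routed[rule["target"]].append(row)
    return routed


rules = json.loads(RULES_JSON)
data = "7,alice\n3,bob\n9,carol\n"
result = route_rows(data, rules)
print(result)
```

The number of targets comes entirely from the rules, which is the part that is hard to express with a fixed set of processors drawn in the GUI.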

Re: Creating a dynamic data flow using apache NIFI

Expert Contributor

The article below might be helpful as an approach:

https://community.hortonworks.com/articles/3160/update-nifi-flow-on-the-fly-via-api.html

In the example, Groovy is used to dynamically update processor properties in NiFi through the REST API.
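
As a rough illustration of the same idea in Python rather than Groovy, the sketch below turns filter rules into RouteText dynamic properties and wraps them in an update body. Several things here are assumptions: the rule layout, the placeholder processor id and revision number, and that RouteText is configured with a matching strategy that evaluates each property as an expression over the `line` variable. The body shape (revision plus component.config.properties) is modelled on the processor entity used by the NiFi REST API; check it against the API documentation for your NiFi version before sending it. The code only builds the payload and does not contact NiFi.

```python
import json

# Comparison functions assumed to be available in NiFi Expression Language.
SUPPORTED_OPS = {"gt", "ge", "lt", "le", "equals"}


def rule_to_expression(rule):
    """Turn one hypothetical rule into an Expression Language string."""
    if rule["op"] not in SUPPORTED_OPS:
        raise ValueError("unsupported operator: " + rule["op"])
    # Doubled braces are literal braces in str.format.
    return "${{line:getDelimitedField({column}):toNumber():{op}({value})}}".format(
        column=rule["column"], op=rule["op"], value=rule["value"]
    )


def build_processor_update(processor_id, revision_version, rules):
    """Build an update body with one dynamic property per rule."""
    properties = {rule["target"]: rule_to_expression(rule) for rule in rules}
    return {
        "revision": {"version": revision_version},
        "component": {
            "id": processor_id,
            "config": {"properties": properties},
        },
    }


# Hypothetical rules, as they might be read from the user's JSON file.
rules = [
    {"target": "TEST1", "column": 1, "op": "gt", "value": 5},
    {"target": "TEST2", "column": 1, "op": "le", "value": 5},
]

# "routetext-processor-id" and revision 3 are placeholders, not real values.
payload = build_processor_update("routetext-processor-id", 3, rules)
print(json.dumps(payload, indent=2))
```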

Re: Creating a dynamic data flow using apache NIFI

New Contributor

Actually, the problem I am facing is that the number of targets will be decided at runtime. How do I decide the number of PutFile/PutHDFS processors while designing the flow in the GUI? How should I handle that part?

Re: Creating a dynamic data flow using apache NIFI

New Contributor

Hi,

From your question, I understand that you are trying to achieve GetFile ---> route lines on some parameters ---> PutFile/PutHDFS.

If I have understood your question correctly, you can do the following:

1. Since you are reading the routing properties from a file, you will have to use the NiFi REST APIs to update the processors. For the NiFi APIs, see the NiFi REST API documentation.

2. In PutFile or PutHDFS, set the Directory property to a path such as /tmp/route/${RouteText.Route:substringBefore('.')}. This reads the RouteText.Route attribute and sets the path dynamically.

With point 2, all you need is a single PutFile or PutHDFS processor, unless your paths are going to vary a lot.

I have attached a template: route-line-to-folder.xml

Let me know if this works.

Re: Creating a dynamic data flow using apache NIFI

New Contributor

This approach is fine except that we do not know beforehand how many targets we have. That will be decided at runtime, and the information will be available in a JSON file. So how many PutFile/PutHDFS processors do I create in the GUI? As far as I understand, we can only put one path in a single PutFile/PutHDFS processor.

Re: Creating a dynamic data flow using apache NIFI

New Contributor

To update the NiFi flow, you have to use the REST APIs.

If you use /tmp/route/${RouteText.Route:substringBefore('.')} as the Directory of PutFile, then only one PutFile processor is required.

Assume you have two routes, route1 and route2. When your flow file takes route1, the expression ${RouteText.Route:substringBefore('.')} evaluates to route1, so the directory path becomes /tmp/route/route1.
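
The evaluation described above can be mimicked in plain Python to see what directory a given flow file would land in. This is only an emulation for illustration, not NiFi code: `substring_before` and `resolve_directory` are hypothetical helper names, and the emulation assumes that when the delimiter is absent the whole value is returned unchanged.

```python
def substring_before(subject, delimiter):
    """Return the part of subject before the first delimiter.

    If the delimiter does not occur, the whole subject is returned.
    """
    index = subject.find(delimiter)
    return subject if index == -1 else subject[:index]


def resolve_directory(base, attributes):
    """Emulate <base>/${RouteText.Route:substringBefore('.')}."""
    route = attributes.get("RouteText.Route", "")
    return base.rstrip("/") + "/" + substring_before(route, ".")


# One PutFile directory expression, two different flow files.
print(resolve_directory("/tmp/route", {"RouteText.Route": "route1"}))
print(resolve_directory("/tmp/route", {"RouteText.Route": "route2"}))
```

Because the directory is computed per flow file from its attribute, adding a third route changes nothing on the PutFile side.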

Re: Creating a dynamic data flow using apache NIFI

New Contributor

I created a flow as shown below:

6489-8hukd.png

I configured the RouteText processor as shown below:

6490-lntje.png

Finally, in the PutFile processor, I added the expression:

"/user/tmp/${RouteText.Route:substringBefore('.')"

But I cannot find any output files in the paths /user/tmp/TEST1 and /user/tmp/TEST2. When I create the same flow with two PutFile processors and hardcode the directories, it works fine. Am I missing something here?

Re: Creating a dynamic data flow using apache NIFI

New Contributor

Hey,

I just tried the same thing and it is working fine for me. Can you check what your termination conditions are?

It looks like the flow file lines are not satisfying any of the input conditions. Can you try sending the unmatched and original relationships to different PutFile processors?

Re: Creating a dynamic data flow using apache NIFI

New Contributor

But that is not possible in my scenario, because I won't know the conditions or the number of PutFile processors in advance. So how can I add multiple PutFile processors? Can you please post an image of your flow? That will help me understand whether your case is the same as mine.

Re: Creating a dynamic data flow using apache NIFI

New Contributor

Please find attached the template file route-line-to-folder.xml. That is the flow I am using.

Please refer to this for how to import a template.

Hope this helps.