Nifi/DataFlow example that loops through a list?

Super Collaborator

I'm a total dataflow/nifi rookie.

I'm trying to accomplish something like the following:

Given a database table like this

Customer_ID (varchar), DoA (boolean), DoB (boolean), DoC (boolean)

I want to:

1) query the table (select *)

2) for each customer:

3a) if DoA, execute some steps (move some files around, etc)

3b) if DoB, execute some steps

3c) if DoC, execute some steps

4) Update some log files, etc.

I've been playing with some of the example templates here: https://cwiki.apache.org/confluence/display/NIFI/Example+Dataflow+Templates

But I haven't found anything to show me how to accomplish step 2 above.

Is it possible to work through a loop like this?

In the nifi training class, the instructor said that this is a common use case, but I can't seem to find a template that looks like this.

Can someone point me at an example to get me going?

Accepted Solutions

Re: Nifi/DataFlow example that loops through a list?

If I'm understanding the scenario correctly, you would probably do something like this...

- ExecuteSQL/QueryDatabaseTable to get data from the database, produces Avro

- ConvertAvroToJSON or ConvertAvroToCSV, I'm going to use JSON going forward

- SplitJson to split each record into its own flow file

- EvaluateJsonPath to extract DoA, DoB, and DoC into flow file attributes
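
To make the split and extract steps above a bit more concrete, here is a minimal Python sketch of what they do conceptually. It is not NiFi code: `split_json` and `evaluate_json_path` are hypothetical stand-ins for the processors, and the sample rows are made up. It assumes the query result arrives as a JSON array, and it stores the extracted values as strings because flow file attributes are strings.

```python
import json


def split_json(array_text):
    """Stand-in for the split step: one 'flow file' per record in a JSON array."""
    records = json.loads(array_text)
    return [{"content": json.dumps(record), "attributes": {}} for record in records]


def evaluate_json_path(flow_file, field_names):
    """Stand-in for the extract step: copy selected fields into string attributes."""
    record = json.loads(flow_file["content"])
    for name in field_names:
        value = record[name]
        # Booleans become "true"/"false" so they can be compared as text later.
        if isinstance(value, bool):
            flow_file["attributes"][name] = "true" if value else "false"
        else:
            flow_file["attributes"][name] = str(value)
    return flow_file


# Made-up query result with the columns from the question.
query_result = json.dumps([
    {"Customer_ID": "c1", "DoA": True, "DoB": False, "DoC": True},
    {"Customer_ID": "c2", "DoA": False, "DoB": True, "DoC": False},
])

flow_files = [
    evaluate_json_path(flow_file, ["DoA", "DoB", "DoC"])
    for flow_file in split_json(query_result)
]

for flow_file in flow_files:
    print(flow_file["attributes"])
```

Each customer row ends up as its own flow file carrying its own DoA/DoB/DoC attributes, which is what the routing step works from.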

From here it depends on the logic you want and on whether those three fields are mutually exclusive (only one is ever true) or whether two out of three can be true. Either way, you would use RouteOnAttribute with a property like DoA = ${DoA:equals("true")}, which sends every flow file that matches to the DoA relationship, and then connect that relationship to the processors that should perform the logic when DoA is true.

You could have a series of RouteOnAttribute processors, or you could have one with complex statements like:

${DoA:equals("true"):and( ${DoB:equals("false")} )}

You can take a look at the expression language guide for more detail on constructing the right expressions:

https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
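
The routing idea can also be sketched in plain Python. This is only an illustration, not how NiFi evaluates expressions: `route_on_attribute` and the rule names are hypothetical, each rule mirrors one of the expressions above, and the sketch assumes a flow file goes to every rule it matches and to "unmatched" when it matches none.

```python
def route_on_attribute(attributes, rules):
    """Return the names of all rules whose predicate matches the attributes."""
    matched = [name for name, predicate in rules.items() if predicate(attributes)]
    return matched or ["unmatched"]


# One rule per flag, plus the compound rule from the answer (DoA true and DoB false).
rules = {
    "DoA": lambda a: a.get("DoA") == "true",
    "DoB": lambda a: a.get("DoB") == "true",
    "DoC": lambda a: a.get("DoC") == "true",
    "A_without_B": lambda a: a.get("DoA") == "true" and a.get("DoB") == "false",
}

customer = {"DoA": "true", "DoB": "false", "DoC": "true"}
print(route_on_attribute(customer, rules))
```

If the flags are not mutually exclusive, one customer can land on several relationships at once, which is why the answer distinguishes the two cases.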

Re: Nifi/DataFlow example that loops through a list?

Super Collaborator

Thanks Bryan,

I'll give this a try!

Re: Nifi/DataFlow example that loops through a list?

Super Collaborator

The key here for me is a shift in thinking.

The SplitJson processor "splits" my flow into X flow files, based on the results of my query, and then I can run Y of them at a time. It's not quite a loop (unless Y == 1), but it makes sense now.
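
For anyone else making the same mental shift, here is a small Python sketch of "X items, Y at a time". It is only an analogy: `handle_customer` and `process_all` are hypothetical names, and `concurrent_tasks` plays the role of Y. With `concurrent_tasks=1` it behaves like an ordinary loop.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_customer(customer_id):
    """Placeholder for the per-customer steps (move files, update logs, ...)."""
    return f"processed {customer_id}"


def process_all(customer_ids, concurrent_tasks):
    """Process every customer, at most `concurrent_tasks` at a time."""
    with ThreadPoolExecutor(max_workers=concurrent_tasks) as pool:
        # map() returns results in input order, whatever order the work finishes in.
        return list(pool.map(handle_customer, customer_ids))


customer_ids = ["c1", "c2", "c3", "c4", "c5"]
sequential = process_all(customer_ids, concurrent_tasks=1)  # Y == 1: a plain loop
parallel = process_all(customer_ids, concurrent_tasks=3)    # Y == 3: three at a time
print(sequential == parallel)
```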

Re: Nifi/DataFlow example that loops through a list?

Super Collaborator

I posted another NiFi question here, if anyone reading this has an answer: https://community.hortonworks.com/questions/56616/options-for-exporting-large-data-sets-from-hive-to...
