How to aggregate CSV or JSON data in NiFi?
- Labels: Apache NiFi
Created ‎08-16-2017 09:53 AM
Hi,
How can I aggregate my CSV data in NiFi? Please suggest some options.
I have a CSV file coming in every hour.
For example, my CSV data is:
Name, Subject
stud1, math
stud2, english
stud3, math
stud4, math
stud5, english
stud6, science
The output I need is:
math, 3
english, 2
science, 1
Can I do this without executing any scripts?
Thank you.
Created ‎08-16-2017 01:59 PM
Hi,
I think you can do that with the QueryRecord processor. Add a new property (the name is up to you) whose value is a query such as "select Subject, count(Name) as nb from flowfile group by Subject".
Configure the schemas for the record reader and the writer correctly, and the query result will be routed to a new relationship on the processor, named after your property.
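To check what that query should return before wiring it into a flow, here is a minimal sketch that runs the same GROUP BY on the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for NiFi, so it only illustrates the SQL, not the processor itself. The table name `flowfile` mirrors the query above; the helper name `aggregate` and the added ORDER BY (used only to get a stable row order) are my own assumptions.

```python
import csv
import io
import sqlite3

# Sample data copied from the question (note the space after each comma).
CSV_TEXT = """Name, Subject
stud1, math
stud2, english
stud3, math
stud4, math
stud5, english
stud6, science
"""


def aggregate(csv_text):
    """Count students per subject with the same GROUP BY as the answer.

    sqlite3 stands in for QueryRecord here; this is not NiFi code.
    """
    # skipinitialspace drops the blank that follows each comma,
    # so the columns are read as "Name" and "Subject".
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    rows = [(r["Name"], r["Subject"]) for r in reader]

    conn = sqlite3.connect(":memory:")
    try:
        # The table is named flowfile to match the query in the answer.
        conn.execute("CREATE TABLE flowfile (Name TEXT, Subject TEXT)")
        conn.executemany("INSERT INTO flowfile VALUES (?, ?)", rows)
        # ORDER BY is an addition so the output order is deterministic.
        query = (
            "SELECT Subject, COUNT(Name) AS nb "
            "FROM flowfile GROUP BY Subject ORDER BY nb DESC"
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for subject, nb in aggregate(CSV_TEXT):
        print(f"{subject}, {nb}")
```

Run on the sample above, this prints the three lines asked for in the question: math, 3; english, 2; science, 1.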
Created ‎08-17-2017 03:09 AM
It worked! Thank you so much.
Can QueryRecord do this aggregation in real time?
For example, I will receive data every second and save it in HBase, but I always need to update the data already in HBase.
