How do I get the count, along with the data, into an attribute in NiFi? Is it possible to do that? If so, please guide me. TIA
NiFi has a CountText processor, which counts the lines, non-empty lines, words, and characters in the flowfile content and adds those counts to the flowfile as attributes.
CountText writes the following attributes:
|text.line.count||The number of lines of text present in the FlowFile content|
|text.line.nonempty.count||The number of lines of text (with at least one non-whitespace character) present in the original FlowFile|
|text.word.count||The number of words present in the original FlowFile|
|text.character.count||The number of characters (given the specified character encoding) present in the original FlowFile|
Suppose the flowfile content is as below, with an empty line as the second line.
We feed this content to the CountText processor with the configuration below:
Count Non-Empty Lines
Split Words on Symbols
Output flowfile attributes:
The CountText processor has added the line count, non-empty line count, and character count to the flowfile as attributes.
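To make the meaning of these attributes concrete, here is a minimal Python sketch that computes comparable counts outside NiFi. The sample content is hypothetical, and the counting rules (words split on whitespace, line terminators not counted as characters) are assumptions, so the exact numbers NiFi reports may differ depending on the processor settings.

```python
# Hypothetical sample content: three lines, the second one empty.
content = "hello world\n\nnifi count text\n"

lines = content.splitlines()

counts = {
    # Every line, including the empty one.
    "text.line.count": len(lines),
    # Lines with at least one non-whitespace character.
    "text.line.nonempty.count": sum(1 for line in lines if line.strip()),
    # Assumption: words are split on whitespace only.
    "text.word.count": sum(len(line.split()) for line in lines),
    # Assumption: line terminators are not counted as characters.
    "text.character.count": sum(len(line) for line in lines),
}

for name, value in counts.items():
    print(f"{name} = {value}")
```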
With the ExecuteStreamCommand processor we can run the wc -l command to get the number of lines in the text document.
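One thing to keep in mind with this approach: wc -l counts newline characters, not lines as such, so a last line without a trailing newline is not counted. The small Python sketch below mimics that behaviour on two hypothetical inputs; the helper name wc_l is made up for illustration.

```python
def wc_l(data: bytes) -> int:
    """Count newline bytes, which is what `wc -l` reports."""
    return data.count(b"\n")


# Hypothetical flowfile contents.
with_trailing_newline = b"a\nb\nc\n"
without_trailing_newline = b"a\nb\nc"

print(wc_l(with_trailing_newline))     # 3
print(wc_l(without_trailing_newline))  # 2, the last line has no newline
```

If the count should end up in an attribute rather than replace the content, the Output Destination Attribute property of ExecuteStreamCommand is probably what you want to look at.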
With the QueryRecord processor we can also get the number of lines (records) in the flowfile content.
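QueryRecord exposes the flowfile content as a table named FLOWFILE, so a count query is a one-liner. The sketch below only illustrates the SQL using Python's built-in sqlite3 module; NiFi runs the query with its own SQL engine against records parsed by the configured Record Reader, and the sample records and column names here are hypothetical.

```python
import sqlite3

# Hypothetical records standing in for the parsed flowfile content.
records = [("alice", 30), ("bob", 25), ("carol", 41)]

conn = sqlite3.connect(":memory:")
# The table is named FLOWFILE to mirror what QueryRecord exposes.
conn.execute("CREATE TABLE FLOWFILE (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO FLOWFILE VALUES (?, ?)", records)

# The same statement could be used as the value of a QueryRecord property.
query = "SELECT COUNT(*) AS cnt FROM FLOWFILE"
(cnt,) = conn.execute(query).fetchone()
print(cnt)

conn.close()
```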
Useful links for the QueryRecord processor
If you are using the QueryDatabaseTable or ExecuteSQL processors, the output flowfile already carries a row count attribute (querydbtable.row.count for QueryDatabaseTable, executesql.row.count for ExecuteSQL), which gives the number of rows fetched from the source.
To convert content to a flowfile attribute:
For this use case we can use the ExtractText processor to extract the content and store it as a flowfile attribute.
ExtractText configs:
Add a new property named data with the regex (.*) as its value, i.e. capture all the content and keep it in a flowfile attribute named data.
Set Enable DOTALL Mode to true if your flowfile content has newlines in it.
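The reason DOTALL matters is that, by default, the dot in a regex does not match line terminators, so (.*) stops at the end of the first line. A quick way to see the difference is the Python sketch below; NiFi uses Java regular expressions, but the dot behaves the same way here. The sample content is hypothetical.

```python
import re

# Hypothetical multi-line flowfile content.
content = "first line\nsecond line\nthird line"

# Without DOTALL, "." stops at the first newline.
without_dotall = re.search(r"(.*)", content).group(1)

# With DOTALL, "." also matches newlines, so the whole content is captured.
with_dotall = re.search(r"(.*)", content, flags=re.DOTALL).group(1)

print(repr(without_dotall))  # 'first line'
print(repr(with_dotall))     # 'first line\nsecond line\nthird line'
```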
The most important properties are:
|Maximum Buffer Size||1 MB||Specifies the maximum amount of data to buffer (per file) in order to apply the regular expressions. Files larger than the specified maximum will not be fully evaluated.|
|Maximum Capture Group Length||1024||Specifies the maximum number of characters a given capture group value can have. Any characters beyond the max will be truncated.|
You have to increase these property values according to your flowfile size to get all the content of the flowfile into the attribute.
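To show why the defaults can silently cut off the attribute value, here is a rough Python simulation of the two limits described above. It is only a sketch of the documented behaviour, not NiFi's implementation; the function name extract_data and its parameters are hypothetical.

```python
import re


def extract_data(content: bytes, max_buffer_size: int, max_group_length: int) -> str:
    """Simulate (.*) with DOTALL under a buffer limit and a capture-group limit."""
    # Only the first max_buffer_size bytes are evaluated.
    buffered = content[:max_buffer_size].decode("utf-8", errors="replace")
    match = re.search(r"(.*)", buffered, flags=re.DOTALL)
    # Characters beyond max_group_length are truncated.
    return match.group(1)[:max_group_length]


# Hypothetical flowfile content of 5000 characters.
content = b"x" * 5000

default_limits = extract_data(content, max_buffer_size=1024 * 1024, max_group_length=1024)
raised_limits = extract_data(content, max_buffer_size=1024 * 1024, max_group_length=5000)

print(len(default_limits))  # 1024, the value was truncated
print(len(raised_limits))   # 5000, the whole content fits
```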
It's not recommended to extract all the contents and keep them as attributes, as the attributes are kept in-memory.
please refer to below link for nifi best practices and deeper