Nifi count and content as Attribute

New Contributor

How do I get the count along with the data into an attribute in NiFi? Is that possible? If so, please guide me. TIA

Re: Nifi count and content as Attribute

New Contributor

@ Matt Burgess

Re: Nifi count and content as Attribute

Super Guru
@Pramod Kalvala

Could you please add more details to your question? For example, are you expecting a count of the number of lines in the flowfile?

Re: Nifi count and content as Attribute

New Contributor

What I meant was, I want to convert the content in a flowfile to an attribute.

Re: Nifi count and content as Attribute

Super Guru

@Pramod Kalvala

NiFi has a CountText processor, which adds the number of lines, non-empty lines, words, and characters in the flowfile content as attributes.

The CountText processor writes these attributes:-

Name: Description
text.line.count: The number of lines of text present in the FlowFile content
text.line.nonempty.count: The number of lines of text (with at least one non-whitespace character) present in the original FlowFile
text.word.count: The number of words present in the original FlowFile
text.character.count: The number of characters (given the specified character encoding) present in the original FlowFile

Example:-

Suppose the flowfile content is as shown below, with an empty second line.

64834-input.png

We feed this content to a CountText processor with the following configs:-

Count Lines: true
Count Non-Empty Lines: true
Count Words: true
Count Characters: true
Split Words on Symbols: true

Output Flowfile Attributes:-

64833-counttext-attributes.png

The CountText processor has added the line count, non-empty line count, and character count to the flowfile as attributes.
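To make the four counts concrete, here is a rough Python sketch of what they measure. It is an approximation under stated assumptions, not NiFi's implementation: the `count_text` function and the sample content are hypothetical, words are split on whitespace only (the Split Words on Symbols option is not modeled), and characters are counted including newlines, which may differ from what the processor reports.

```python
def count_text(content: str) -> dict:
    """Approximate the counts that CountText writes, keyed by attribute name."""
    lines = content.splitlines()
    return {
        "text.line.count": len(lines),
        # A line is non-empty if it has at least one non-whitespace character.
        "text.line.nonempty.count": sum(1 for line in lines if line.strip()),
        # Assumption: whitespace-only word splitting.
        "text.word.count": len(content.split()),
        # Assumption: every character counts, newlines included.
        "text.character.count": len(content),
    }


# Hypothetical content with an empty second line, like the example above.
sample = "first line\n\nthird line here\n"
for name, value in count_text(sample).items():
    print(name, "=", value)
```

The point of the sketch is only to show how an empty line affects the line count but not the non-empty line count.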

(or)

Using the ExecuteStreamCommand processor, we can run the wc -l command to get the number of lines in the flowfile content.
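One thing to keep in mind with wc -l is that it counts newline characters, so a last line without a trailing newline is not counted. A small Python sketch of that behaviour (the `wc_l` helper is a hypothetical stand-in for the command, not something NiFi provides):

```python
def wc_l(data: bytes) -> int:
    """Count newline bytes, mirroring how `wc -l` counts lines."""
    return data.count(b"\n")


# Same two lines, with and without a trailing newline.
print(wc_l(b"a\nb\n"))
print(wc_l(b"a\nb"))
```

If the count looks one lower than expected, check whether the content ends with a newline.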

(or)

Use the QueryRecord processor to count the lines in the flowfile content.

Useful links for the QueryRecord processor:

https://community.hortonworks.com/articles/140183/counting-lines-in-text-files-with-nifi.html

https://community.hortonworks.com/articles/146096/counting-lines-in-text-files-with-nifi-part-2.html

If you are using the QueryDatabaseTable or ExecuteSQL processors, the output flowfile carries a row count attribute (querydbtable.row.count or executesql.row.count, respectively) that gives the number of rows fetched from the source.

To convert content to a flowfile attribute:-

For this use case we can use the ExtractText processor to extract the content and store it as a flowfile attribute.

ExtractText configs:-

64835-extracttext.png

Add a new property named data with the regex (.*), i.e. capture all the content and keep it in a flowfile attribute named data.

Change Enable DOTALL Mode to true if your flowfile content has new lines in it.
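The effect of DOTALL on the (.*) regex can be seen in a short Python sketch. Python's re module stands in here for the Java regex engine that NiFi uses, and the sample content is made up; for this simple pattern the two engines behave the same way.

```python
import re

# Hypothetical multi-line flowfile content.
content = "line one\nline two\nline three"

# Without DOTALL, "." does not match newlines, so the capture stops at the first line.
without_dotall = re.match(r"(.*)", content).group(1)

# With DOTALL, "." also matches newlines, so the capture spans the whole content.
with_dotall = re.match(r"(.*)", content, re.DOTALL).group(1)

print(repr(without_dotall))
print(repr(with_dotall))
```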

The most important properties are:

Maximum Buffer Size (default 1 MB): Specifies the maximum amount of data to buffer (per file) in order to apply the regular expressions. Files larger than the specified maximum will not be fully evaluated.
Maximum Capture Group Length (default 1024): Specifies the maximum number of characters a given capture group value can have. Any characters beyond the max will be truncated.

You have to increase these property values according to your flowfile size to get all the content of the flowfile into the attribute.
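As a rough illustration of why both limits matter, the sketch below models the two properties as they are described above. It is not NiFi's code: `extract_as_attribute` and its parameter names are hypothetical, and the buffer limit is modeled simply as cutting the content at that many bytes before the regex runs.

```python
import re


def extract_as_attribute(content: str,
                         max_buffer_bytes: int = 1024 * 1024,
                         max_capture_group_length: int = 1024) -> str:
    """Model of the two limits: buffer the content, match (.*), truncate the capture."""
    # Only the first max_buffer_bytes bytes are evaluated by the regex.
    buffered = content.encode("utf-8")[:max_buffer_bytes].decode("utf-8", errors="ignore")
    match = re.search(r"(.*)", buffered, re.DOTALL)
    value = match.group(1) if match else ""
    # Characters beyond the capture group limit are dropped.
    return value[:max_capture_group_length]


content = "x" * 5000
print(len(extract_as_attribute(content)))
print(len(extract_as_attribute(content, max_capture_group_length=10000)))
```

With the defaults, a 5000-character flowfile ends up as a 1024-character attribute; raising the capture group length lets the whole content through, as long as it also fits in the buffer.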

It's not recommended to extract all the contents and keep them as attributes, as the attributes are kept in-memory.

Please refer to the links below for NiFi best practices and a deeper view:

https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#DeeperView

https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#best-practice
