
NiFi: Persist flowfile attributes


The following is the set of attributes that the LogAttribute processor logs for every flowfile. I have a few concerns with this approach:

1) It involves frequent disk writes.

2) It logs extensive boilerplate (complete attribute names plus literals such as "----------------------test----------------------", "Standard FlowFile Attributes", "Key:", "Value:", etc.).

3) It is not structured as a record, or perhaps as JSON, for easier analysis.

4) Standard FlowFile Attributes are included by default, regardless of the inclusion/exclusion settings in the processor's properties.

Do you have any suggestions on the following:

1) Persist only the essential attributes, preferably in a structured format, e.g.

{"key1": "value1", "key2": "value2"}

2) Persist only at regular intervals, perhaps as a merge of multiple flowfiles (using the MergeContent processor), rather than writing every second and every flowfile separately.

3) Branch the flow so that subsequent processors do not have to wait for the attributes to be persisted to the log, e.g.

consumekafka --> mergecontent --> logattribute
      |
      +--> convertavrotojson ...
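To illustrate suggestion 1, here is a minimal sketch (plain Python, outside NiFi; the attribute map below is hypothetical, mirroring the sample log) of reducing an attribute map to its essential keys and emitting a single JSON record. Inside NiFi itself, the AttributesToJSON processor provides this kind of output.

```python
import json

def attributes_to_json(attributes, essential_keys):
    """Keep only the essential attributes and render them as one JSON record."""
    record = {k: attributes[k] for k in essential_keys if k in attributes}
    return json.dumps(record, sort_keys=True)

# Hypothetical flowfile attribute map, mirroring the sample log below.
attrs = {
    "entryDate": "Sat Dec 10 00:44:09 EST 2016",
    "fileSize": "245027",
    "kafka.offset": "10557",
    "kafka.partition": "0",
}
print(attributes_to_json(attrs, ["kafka.offset", "kafka.partition"]))
# {"kafka.offset": "10557", "kafka.partition": "0"}
```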

Sample app log:

----------------------test----------------------
Standard FlowFile Attributes
Key: 'entryDate'
Value: 'Sat Dec 10 00:44:09 EST 2016'
Key: 'lineageStartDate'
Value: 'Sat Dec 10 00:44:09 EST 2016'
Key: 'fileSize'
Value: '245027'
FlowFile Attribute Map Content
Key: 'kafka.offset'
Value: '10557'
Key: 'kafka.partition'
Value: '0'
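For reference, the verbose format above can be recovered into key/value pairs after the fact, which shows why a structured format would be easier to analyze. A rough sketch, assuming the Key:/Value: layout shown in the sample log:

```python
import re

def parse_log_attributes(log_text):
    """Extract Key:/Value: pairs from LogAttribute's verbose output."""
    pairs = re.findall(r"Key: '([^']*)'\s*Value: '([^']*)'", log_text)
    return dict(pairs)

sample = """\
Standard FlowFile Attributes
Key: 'fileSize'
Value: '245027'
FlowFile Attribute Map Content
Key: 'kafka.offset'
Value: '10557'
"""
print(parse_log_attributes(sample))
# {'fileSize': '245027', 'kafka.offset': '10557'}
```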


Re: NiFi: Persist flowfile attributes

Kumar, the LogAttribute processor is just that: a simple log.

If you need any customized handling of attributes, formats, data redaction, etc., tap into the stream and process it as a regular dataflow.