Flowfile immutability

Expert Contributor

NiFi docs mention this:

While the contents and attributes of a FlowFile can change, the FlowFile object is immutable.

How can a FlowFile be immutable if its contents and attributes can change?

3 REPLIES

Master Mentor

@manishg 

A NiFi FlowFile consists of two parts:
1. FlowFile Content - FlowFile content is stored in a content claim inside the NiFi content_repository. Once content is written to a claim, it cannot be modified.
2. FlowFile Attributes/Metadata - FlowFile attributes/metadata are stored in the flowfile_repository. They consist of key/value pairs about the FlowFile. Some are created by NiFi on every FlowFile (filename, date, location of the content within the content_repository, etc.), and others are added later by NiFi processors. The FlowFile attributes can be modified.

When a FlowFile is created, its content is written to a content claim. Within a NiFi dataflow you may have processors that modify the content of a FlowFile. Depending on the processor, the modification can have one of two outcomes, both of which result in new content being written to a new content claim:
1. The processor writes the new content to a new content claim, and the FlowFile attributes/metadata are updated to reference that new claim going forward.
2. If the processor has an "original" relationship, the original FlowFile is routed to that relationship, while any new FlowFiles derived from the original are created and routed to another outbound relationship.
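
To make the first outcome concrete, here is a minimal toy model in Python. It is not NiFi code: `ContentClaim`, `FlowFileRecord`, `put_attribute`, and `write_content` are hypothetical names invented for this sketch. It only illustrates the idea that a "change" never edits existing data in place; it produces a new record, and changed content goes to a new claim while the old claim stays untouched.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class ContentClaim:
    """Toy stand-in for a content claim: written once, never modified."""
    claim_id: int
    data: bytes


@dataclass(frozen=True)
class FlowFileRecord:
    """Toy stand-in for a FlowFile: attributes plus a pointer to a claim."""
    uuid: str
    attributes: Mapping[str, str]
    claim: ContentClaim


def put_attribute(ff: FlowFileRecord, key: str, value: str) -> FlowFileRecord:
    # Build a new read-only attribute map; the old record is left as it was.
    new_attrs = MappingProxyType({**ff.attributes, key: value})
    return replace(ff, attributes=new_attrs)


def write_content(ff: FlowFileRecord, new_data: bytes, next_claim_id: int) -> FlowFileRecord:
    # New content goes to a NEW claim; the record is re-pointed at it.
    return replace(ff, claim=ContentClaim(next_claim_id, new_data))


original = FlowFileRecord(
    uuid="ff-1",
    attributes=MappingProxyType({"filename": "a.txt"}),
    claim=ContentClaim(1, b"hello"),
)
renamed = put_attribute(original, "filename", "b.txt")
rewritten = write_content(renamed, b"HELLO", next_claim_id=2)

# Attempting to mutate a record in place is rejected.
try:
    original.uuid = "changed"
    mutated_in_place = True
except FrozenInstanceError:
    mutated_in_place = False
```

In this sketch `original` still reports `a.txt` and still points at claim 1 after both operations, while `rewritten` is the same logical FlowFile (same `uuid`) pointing at claim 2. This is the sense in which the object is immutable even though the FlowFile's attributes and content appear to change as it moves through the flow.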

It is also possible that your dataflow duplicates a single FlowFile, such as by routing the same outbound success relationship twice. Anytime a FlowFile is duplicated, NiFi creates a clone of the FlowFile. Both FlowFiles are unique; however, both point at the same content claim in the content_repository. NiFi tracks a claimant count for every content claim in the content_repository. For every FlowFile pointing at a content claim, the claimant count is incremented. As each FlowFile pointing at a content claim reaches a point of termination in the dataflow, the claimant count is decremented. Only content claims whose claimant count is zero can be archived and eventually purged from the content_repository.
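
The claimant counting described above can be sketched as simple reference counting. Again, this is a toy model under stated assumptions, not NiFi's implementation: the `ClaimTracker` class, its method names, and the `"claim-1"` identifier are all hypothetical.

```python
from collections import Counter


class ClaimTracker:
    """Toy reference counter for content claims (hypothetical, not NiFi's API)."""

    def __init__(self) -> None:
        self.claimant_counts: Counter = Counter()

    def add_claimant(self, claim_id: str) -> None:
        # A FlowFile (original or clone) now points at this claim.
        self.claimant_counts[claim_id] += 1

    def remove_claimant(self, claim_id: str) -> None:
        # A FlowFile pointing at this claim reached termination.
        if self.claimant_counts[claim_id] == 0:
            raise ValueError(f"{claim_id} has no claimants to remove")
        self.claimant_counts[claim_id] -= 1

    def is_purgeable(self, claim_id: str) -> bool:
        # Only claims with zero claimants may be archived or purged.
        return self.claimant_counts[claim_id] == 0


tracker = ClaimTracker()
tracker.add_claimant("claim-1")      # original FlowFile created
tracker.add_claimant("claim-1")      # clone points at the same claim
count_after_clone = tracker.claimant_counts["claim-1"]

tracker.remove_claimant("claim-1")   # original terminates
purgeable_with_clone_alive = tracker.is_purgeable("claim-1")

tracker.remove_claimant("claim-1")   # clone terminates
purgeable_after_all_done = tracker.is_purgeable("claim-1")
```

The point of the sketch is that cloning is cheap (no content is copied, only a counter changes) and that the content stays in the repository for as long as at least one FlowFile still references it.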

Hope this helps clarify the lifecycle of a NiFi FlowFile.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt

Expert Contributor

So if I understand correctly, as a FlowFile progresses through the processors in a dataflow and those processors change its content, each new version of the content is saved in its own content claim in the content repository, and each claim is immutable. So over time, a FlowFile can be associated with a series of content claims.

The metadata for a FlowFile is mutable and keeps changing as the FlowFile progresses through the flow.

Master Mentor

Correct. Over its lifetime in the dataflow, a FlowFile might point at different content claims for its content. That all depends on the processors used in the dataflow.

Thank you,
Matt