Member since
09-11-2024
4
Posts
3
Kudos Received
0
Solutions
08-01-2025
06:39 AM
@Krish98 When you secure NiFi (HTTPS enabled), in the TLS exchange NiFi will either REQUIRE (if no additional methods of authentication are configured) or WANT (when additional method of authentication are configured, like SAML) a clientAuth certificate. This is necessary for NiFi clusters to work. Even when one node communicates with another, the nodes to be authenticated (done via a mutual TLS exchange) and authorized (authorizing those clientAuth certificates to necessary NiFi policies). When accessing the NiFi UI, a MutualTLS exchange happens with your browser (client). If the browser does not respond with a clientAuth certificate, NiFi will attempt next configured auth method, it your case that would be SAML. MutualTLS with trusted ClientAuth certificates removes the need to obtain any tokens, renew tokens, and simplifies automation tasks with the rest-api whether interacting via NiFi built dataflows or via external interactions with the NiFi rest-api. The ClientAuth certificate DN is what is used as the user identity (final user identity that needs to be authorized is derived from the DN post any Identity Mapping Properties manipulation). Just like your SAML user identities, your clientAuth certificate derived user identity needs to be authorized to whichever NiFi policies are needed for the requested rest-api endpoint. Tailing the nifi-user.log while making your rest-api calls will show you the derived user identity and missing policy when request is not authorized. Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
... View more
12-23-2024
06:25 AM
1 Kudo
@Krish98 Not enough details here to determine what is going on. Do all your FlowFile that are being merged contain the required FlowFile attributes needed for this to be successful? Name Description fragment.identifier Applicable only if the <Merge Strategy> property is set to Defragment. All FlowFiles with the same value for this attribute will be bundled together. fragment.index Applicable only if the <Merge Strategy> property is set to Defragment. This attribute indicates the order in which the fragments should be assembled. This attribute must be present on all FlowFiles when using the Defragment Merge Strategy and must be a unique (i.e., unique across all FlowFiles that have the same value for the "fragment.identifier" attribute) integer between 0 and the value of the fragment.count attribute. If two or more FlowFiles have the same value for the "fragment.identifier" attribute and the same value for the "fragment.index" attribute, the first FlowFile processed will be accepted and subsequent FlowFiles will not be accepted into the Bin. fragment.count Applicable only if the <Merge Strategy> property is set to Defragment. This attribute indicates how many FlowFiles should be expected in the given bundle. At least one FlowFile must have this attribute in the bundle. If multiple FlowFiles contain the "fragment.count" attribute in a given bundle, all must have the same value. segment.original.filename Applicable only if the <Merge Strategy> property is set to Defragment. This attribute must be present on all FlowFiles with the same value for the fragment.identifier attribute. All FlowFiles in the same bundle must have the same value for this attribute. The value of this attribute will be used for the filename of the completed merged FlowFile. Are the FlowFile attribute values correct? How are the individual FlowFile fragments being produced? I find it odd that you say 1 FlowFile is routed to "merged". This implies the a merge was success. Are you sure the "merged" FlowFile only contains the content of one FlowFile from the set of fragments? Any chance you routed both the "original" and "failure" relationships to the same connection? Can you share the full MergeContent processor configuration (all tabs)? When a FlowFile is routed to "failure", I would expect to see logging in the nifi-app.log related to reason for the failure. What do you see in the nifi-app.log? How many unique bundles are you trying to merge concurrently? I see you have set "Max Num Bins" to 200. So you expect to have 200 unique fragment.identifier bundles to merge at one time? How many FlowFiles typically make up one bundle? I also see you have "Max Bin Age" set to 300 sec (5 mins). Are all FlowFiles with same fragment.identifier making it to the MergeContent processor within the 5 minute of one another? Keep in mind that the MergeContent has the potential to consume a lot of NiFi heap memory depending on how it is configured. FlowFile Attributes/metadata is held in heap memory. FlowFiles that are allocated to MergeContent bins have there FlowFiles held in heap memory. So depending on how many FlowFiles make up a typical bundle, the number of bundles being concurrently handled (200 max), and the amount + size of the individual FlowFile attributes, binned FlowFils can consume a lot of heap memory. Do you encounter any out of memory errors in yoru nifi-app.log (might not be thrown by mergeContent). You must be mindful of this when designing a dataflow that uses MergeContent processors. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
... View more
10-10-2024
09:54 AM
2 Kudos
@Krish98 Most NiFi Heap memory issues are directly related to dataflow design. The Apache NiFi documentation for the individual components generally does a good job with reporting "System Resource Considerations". So the first step would be to review the documentation for the components you are using to see which list "MEMORY" as system resource consideration. Example: SplitContent 1.27.0 Then sharing your configuration of those components might help with providing suggestions that may help you. - Split and Merge processor depending on how they are configured can utilize a lot of heap. - Distributed Map cache also resides in HEAP and can contribute to to significant heap usage depending on configuration and the size of what is being written to it. Beyond components: - NiFi loads the entire flow.json.gz (uncompressed it to heap memory). This includes any NiFi Templates (Deprecated in Apache NiFi 1.x and removed in newer Apache NiFi 2.x version). Templates should no longer be used. Any templates created which are listed in the NiFi templates UI should be downloaded so they are stored outside of NiFi and then deleted from NiFi to reduce heap usage. - NiFi FlowFiles - NiFi FlowFlowFiles are what transition between components via connections in your dataflow(s). A FlowFile consists of two parts. FlowFile content stored in content claims in the content_repository and FlowFile metadata/attributes held in heap memory and persisted to flowfile_repository. So if you are creating a lot of FlowFile attributes on your FlowFiles or creating very large FlowFile attributes (like extract content to an attribute), that can result in high heap usage. A connection does have a default threshold at which time a swap file is created to reduce heap usage. Swap files are created with 10,000 FlowFiles in each swap file. The first swap file would not be created until a connection on a specific node reached 20,000 at which point 10,000 would be moved to a swap file and the 10,000 highest priority would remain in heap. The default "back pressure object threshold" on a connection is 10,000 meaning that with defaults no connection would ever create a swap file. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
... View more
09-13-2024
03:07 AM
1 Kudo
@Krish98, Welcome to our community! To help you get the best possible answer, I have tagged in our NiFi experts @SAMSAL @MattWho who may be able to assist you further. Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
... View more