Member since
10-17-2022
17
Posts
3
Kudos Received
0
Solutions
04-01-2025
12:40 AM
Hi Team, below are the basic steps I am following to upgrade NiFi from version 1.23.2 to version 2.3.0 on my RHEL 9 server:

1. Unzip the newer NiFi version: unzip nifi-2.3.0-bin.zip
2. Copy the flow files from the older to the newer version: cp nifi-1.23.2/conf/flow.* nifi-2.3.0/conf/
3. Copy the state directory from the older to the newer version: cp -r nifi-1.23.2/state nifi-2.3.0/state/
4. Update nifi.properties with the sensitive props key: nifi.sensitive.props.key=<<same as older version>>
5. Start NiFi: sh nifi-2.3.0/bin/nifi.sh start

Problem statement: I am getting the NiFi UI console for the newer version (NiFi 2.3.0), and all the process groups and processors that were in the older version were copied to the new version as well. But a few processors (like JoltTransform, PublishKafka, etc.) still show that they are using the 1.23.2 NAR file.

Troubleshooting steps that did not work: I deleted the work directory and revalidated that lib is not using any NAR file from the older version.

1. Delete the work directory: rm -rf nifi-2.3.0/work/*
2. Delete any 1.23.2 NARs from the lib directory, in case it has any: rm -rf nifi-2.3.0/lib/*.1.23.2*.nar
3. Restart the NiFi service: sh nifi-2.3.0/bin/nifi.sh restart

I even manually changed the version in the flow.json and flow.xml files from 1.23.2 to 2.3.0. After doing this the library is 2.3.0, but the processors are in an invalid state. Can you please mention the steps that I am missing while upgrading from 1.23.2 to 2.3.0? Why are not all the processors getting upgraded as expected?
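The manual checks described (deleting work/, scanning lib/ for old NARs) can be automated. Below is a minimal, hedged sketch of such a scan; find_stale_nars is a hypothetical helper and the file names in the demonstration are made up, not actual NiFi NAR names.

```python
import tempfile
from pathlib import Path

def find_stale_nars(root: str, old_version: str) -> list[str]:
    """Return relative paths of .nar files still carrying the old version string."""
    base = Path(root)
    return sorted(
        str(p.relative_to(base))
        for p in base.rglob("*.nar")
        if old_version in p.name
    )

# Demonstration against a throwaway directory layout (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "lib").mkdir()
    (root / "work" / "nar").mkdir(parents=True)
    (root / "lib" / "nifi-jolt-nar-2.3.0.nar").touch()
    (root / "work" / "nar" / "nifi-kafka-nar-1.23.2.nar").touch()
    stale = find_stale_nars(tmp, "1.23.2")
    print(stale)  # any hits mean an old-version NAR is still on disk
```

Running a scan like this over the whole 2.3.0 install directory (including work/ and any extensions/ directory) before startup helps confirm no 1.23.2 NAR is left to be loaded.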
Labels:
Apache NiFi
03-21-2025
12:16 PM
@PriyankaMondal Looking at your error message log, I can see you're experiencing authentication timeouts with the ConsumeIMAP and ConsumePOP3 processors when connecting to Microsoft Office 365 services.

Possible blockers:

Timeout issue: The primary error is "Read timed out" during authentication, which suggests the connection to Office 365 is being established but then timing out during the OAUTH2 handshake.

Microsoft 365 specific considerations: Microsoft has specific requirements for modern authentication with mail services and has been deprecating basic authentication methods.

Processor configuration: Using the OAUTH2 authentication mode is correct for Office 365, but there may be issues with token acquisition or the timeout settings.

Possible solutions:

1. Check timeout settings. Add these properties to your processor configuration:
mail.imap.connectiontimeout=60000
mail.imap.timeout=60000
mail.pop3.connectiontimeout=60000
mail.pop3.timeout=60000

2. Verify Modern Authentication settings. Ensure the account has Modern Authentication enabled in the Microsoft 365 Admin Center, and verify the application registration in Azure AD has the correct permissions: IMAP.AccessAsUser.All for IMAP, POP.AccessAsUser.All for POP3, and the offline_access scope for refresh tokens.

3. Update NiFi mail libraries. NiFi's default JavaMail implementation might have compatibility issues with Office 365. Try updating to the latest version of NiFi (if possible), or add Microsoft's MSAL (Microsoft Authentication Library) JAR to NiFi's lib directory.

4. Use a custom SSL Context Service. Microsoft servers might require specific TLS settings. Create a Standard SSL Context Service with Protocol: TLS and reference it in the advanced client settings for the processor.

5. Alternative approach: use the Microsoft Graph API. Since Microsoft is moving away from direct IMAP/POP3 access, consider using the InvokeHTTP processor to authenticate against the Microsoft Graph API, then using the Graph API endpoints to retrieve email content.

6. Check proxy settings. If your environment uses proxies, add these properties:
mail.imap.proxy.host=your-proxy-host
mail.imap.proxy.port=your-proxy-port
mail.pop3.proxy.host=your-proxy-host
mail.pop3.proxy.port=your-proxy-port

7. Implementation steps:
- Update the processor configuration with extended timeout values.
- Verify the OAuth2 settings in the processor match your Azure application registration exactly.
- Check the Microsoft 365 account settings to ensure IMAP/POP3 is enabled with Modern Authentication.
- Consider implementing a token-debugging flow using InvokeHTTP to validate token acquisition separately.

Happy hadooping
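The token-debugging idea above can be sketched by building the token request by hand. This is a minimal sketch assuming the Microsoft identity platform v2.0 client-credentials flow; the tenant/client values are placeholders, and the actual POST (via InvokeHTTP, curl, or an HTTP library) is left out.

```python
def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, dict]:
    """Build the POST target URL and form body for an OAuth2 client-credentials
    token request against the Microsoft identity platform v2.0 endpoint."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Client-credentials flows use the .default scope; delegated flows
        # would request IMAP.AccessAsUser.All / POP.AccessAsUser.All instead.
        "scope": "https://outlook.office365.com/.default",
    }
    return url, body

# Placeholder values, not real credentials.
url, body = build_token_request("my-tenant-id", "my-client-id", "my-secret")
print(url)
```

POSTing that body (form-encoded) and inspecting the JSON response separately from NiFi makes it easier to tell whether the timeout is in token acquisition or in the IMAP/POP3 session itself.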
08-06-2024
04:35 AM
@PriyankaMondal wrote: Hi Team, I want to achieve the below-mentioned transformation in NiFi using any processor. Please help me to get this done.

Sample input:
{
  "date": "35 days 11:13:10.88",
  "key1": "value1",
  "keyToBeMapped1": "hostname.com",
  "key2": "value2",
  "key3": "value3",
  "key4": "value4",
  "keyToBeMapped2": "High Paging Rate",
  "key5": "PAGING",
  "keyToBeMapped3": "A high paging activity has been detected on host abc.lab.com. This could mean that too many processes are being run",
  "Entity OID": "keyToBeMapped1",
  "Parameter": "keyToBeMapped2",
  "Description": "keyToBeMapped3"
}

Expected output:
{
  "date": "35 days 11:13:10.88",
  "key1": "value1",
  "keyToBeMapped1": "hostname.com",
  "key2": "value2",
  "key3": "value3",
  "key4": "value4",
  "keyToBeMapped2": "High Paging Rate",
  "key5": "PAGING",
  "keyToBeMapped3": "A high paging activity has been detected on host abc.lab.com. This could mean that too many processes are being run",
  "Entity OID": "hostname.com",
  "Parameter": "High Paging Rate",
  "Description": "A high paging activity has been detected on host abc.lab.com. This could mean that too many processes are being run"
}

Regards, Priyanka

Hello, you can achieve this transformation in NiFi using the JoltTransformJSON processor. Jolt is a JSON-to-JSON transformation library that allows you to specify transformations in a declarative way. Here's how you can set it up.

Steps to configure the JoltTransformJSON processor:
1. Add the JoltTransformJSON processor: drag and drop the JoltTransformJSON processor onto your NiFi canvas.
2. Configure the processor: double-click the processor to open its configuration dialog and go to the Properties tab.
3. Set the Jolt Specification: in the Jolt Specification property, define the transformation rules. Here's the Jolt spec you need for your transformation: [
{
"operation": "shift",
"spec": {
"date": "date",
"key1": "key1",
"keyToBeMapped1": "keyToBeMapped1",
"key2": "key2",
"key3": "key3",
"key4": "key4",
"keyToBeMapped2": "keyToBeMapped2",
"key5": "key5",
"keyToBeMapped3": "keyToBeMapped3",
"Entity OID": "@(1,keyToBeMapped1)",
"Parameter": "@(1,keyToBeMapped2)",
"Description": "@(1,keyToBeMapped3)"
}
}
]

4. Apply the configuration: click Apply to save the configuration.
5. Connect the processor: connect the JoltTransformJSON processor to the next processor in your flow.

Explanation of the Jolt specification: the operation is set to shift, which means we are mapping fields from the input JSON to the output JSON. The spec defines the mapping rules. For example, "Entity OID": "@(1,keyToBeMapped1)" means that the value of keyToBeMapped1 should be assigned to Entity OID in the output JSON.

Example flow:
1. GenerateFlowFile (to simulate input JSON)
2. JoltTransformJSON (with the above specification)
3. LogAttribute (to log the transformed JSON)

This setup should transform your input JSON into the expected output format. Hope this helps.

Best regards, florence023
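Outside NiFi, the intended mapping can be described in a few lines of Python. This is a sketch of what the transformation is meant to do (replace a value that names another key with the value stored under that key), not the Jolt engine itself; map_reference_keys is a hypothetical helper.

```python
def map_reference_keys(record: dict, mapping: dict) -> dict:
    """For each target key, replace its value with the value stored under
    the referenced source key, leaving all other fields untouched."""
    out = dict(record)
    for target, source_key in mapping.items():
        if source_key in record:
            out[target] = record[source_key]
    return out

# Abbreviated version of the sample input above.
sample = {
    "keyToBeMapped1": "hostname.com",
    "keyToBeMapped2": "High Paging Rate",
    "Entity OID": "keyToBeMapped1",
    "Parameter": "keyToBeMapped2",
}
result = map_reference_keys(
    sample, {"Entity OID": "keyToBeMapped1", "Parameter": "keyToBeMapped2"}
)
print(result["Entity OID"], "|", result["Parameter"])
```

Checking the Jolt spec against this reference behavior (for example in the Jolt demo sandbox) is a quick way to validate the spec before wiring it into the flow.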
07-21-2024
09:48 PM
Hi Team, I am using HandleHttpRequest to receive data in NiFi (version 1.23.2), and the number of nodes in the NiFi cluster is 3. Data can arrive through a load balancer that distributes the load in a round-robin way. My requirement is to count the total number of events per minute across the cluster. How can I achieve that?

My current flow: HandleHttpRequest --> MergeRecord (bin age is 1 min) --> CalculateRecordStats (to get the count)

But my current flow counts the number of events received by a single node in a minute, not the total number of events received by the cluster in 1 minute. Please suggest.

Regards, Priyanka Mondal
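The missing piece in a flow like this is an aggregation step across nodes: each node counts what it received, and something has to sum those per-node counts per minute window. A minimal sketch of that aggregation step, assuming each node can publish its per-minute count to some shared store (node names and numbers here are made up):

```python
from collections import Counter

def cluster_counts(per_node_counts: list[dict]) -> Counter:
    """Sum per-minute event counts reported by each node into cluster-wide totals."""
    total = Counter()
    for node in per_node_counts:
        total.update(node)  # adds counts key-by-key (minute window -> events)
    return total

# Hypothetical per-node counts for two one-minute windows.
node_a = {"12:00": 40, "12:01": 55}
node_b = {"12:00": 38, "12:01": 61}
node_c = {"12:00": 42, "12:01": 50}
totals = cluster_counts([node_a, node_b, node_c])
print(totals["12:00"], totals["12:01"])
```

In NiFi terms, the same effect could be had by routing the per-node counts to one place (for example a load-balanced connection set to a single node, or an external store) before summing; the sketch only shows the arithmetic, not a specific NiFi design.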
Labels:
Apache NiFi
07-17-2024
10:20 AM
1 Kudo
@PriyankaMondal In versions of Apache NiFi older than 1.16, NiFi does not allow any edits within the NiFi cluster while a node is disconnected. Changes are only allowed on the actual disconnected node.

In Apache NiFi 1.16.0, NiFi introduced a new flow inheritance feature that allowed nodes joining with an existing flow.xml.gz/flow.json.gz that does not match the cluster-elected flow to join the cluster by inheriting the cluster-elected flow. A joining node would only be blocked from this process if inheriting the cluster flow would result in data loss (meaning the joining node's flow contains a connection holding queued FlowFiles and the cluster-elected flow does not have that connection).

Later it was determined that this change can make it difficult to handle the outcome of the above issue: https://issues.apache.org/jira/browse/NIFI-11333

So it was decided that the best course of action was to not allow any component deletion while a node is disconnected. When a NiFi node is started, it attempts to join the cluster. If the node fails to join the cluster, it shuts back down to keep users from mistakenly using it as a standalone node. That means the user had no easy way to handle the queued data in the connection preventing the rejoin. Of course, users could configure the node to come up standalone, but that does not make things any easier on the end user. The node loads up standalone, loads its FlowFiles, and, depending on whether auto.resume was set or not, starts processing FlowFiles. This still leaves the user with FlowFiles queued in many connections throughout the UI, and they would have a very difficult time determining which connection(s) were removed and need to be processed out in order to rejoin the cluster. So the decision was made to stop allowing deletion while a node is disconnected.

That being said, when a NiFi cluster has a disconnected node, users can decide to navigate to the cluster UI and drop the disconnected node(s) from the cluster. The cluster will then have full functionality again, as it will report all existing nodes as connected. It will require a restart of the dropped node(s) to get them to attempt to connect to the cluster again. But keep in mind that when a dropped node attempts to join the cluster and inherit the cluster flow, you may run into the problem described above.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
02-20-2024
02:23 AM
1 Kudo
@PriyankaMondal, Did any of the responses assist in resolving your query? If so, kindly mark the relevant reply as the solution, as it will aid others in locating the answer more easily in the future.
02-01-2024
07:09 AM
1 Kudo
@PriyankaMondal I am not clear on your statement:

if Nifi processor (any processor within a process group) stops suddenly due to load/any other issue

Are you saying you see a NiFi processor transition to a stopped state unexpectedly? This should never happen. Or are you saying the processor seems to stop processing FlowFiles even though it is currently in a running/started state?

NiFi queues FlowFiles in the connections between processor components. A FlowFile is not removed from the inbound connection to a processor component until that FlowFile has been successfully processed by the consuming processor. The FlowFile consists of two parts:
1. FlowFile attributes/metadata, persisted in the NiFi flowfile_repository.
2. FlowFile content, persisted within claims inside the content_repository.

To protect from data loss, these repositories should be using protected storage such as RAID. So if NiFi were to suddenly crash, or the server itself crash, when NiFi is restarted on that down node it will load its flow and then load the FlowFiles back into the connections. Processing will begin again against those FlowFiles by the downstream processor components.

NiFi's design favors data duplication over data loss in order to avoid the possibility of losing data. For example: let's assume a NiFi processor completed execution against a FlowFile, resulting in writing something out to an external endpoint. In response to that successful operation, the processor would then move the FlowFile from the inbound connection to a downstream relationship. If NiFi were to crash in that very moment, before the FlowFile was moved, on startup the same FlowFile would load in the inbound connection and get processed again.

Also keep in mind that you are running a 3-node NiFi cluster, and within a NiFi cluster each connected node runs its own copy of the flow, its own set of repositories, and its own local state. So each node is unaware of the FlowFiles being processed by another node in the same cluster.

Generally speaking, when you have a processor that shows the active-threads indicator with zeroed-out stats, you either have a very long running thread or a hung thread (only examination of a series of thread dumps can make that determination). Most commonly this is a resource utilization problem, but it could also be a dataflow design issue, a client library issue, or a network issue.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
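The process-then-ack semantics described above (a FlowFile stays in the inbound connection until processing succeeds, so a crash yields duplication rather than loss) can be illustrated with a small sketch. This models the general at-least-once pattern, not NiFi's internals; process_with_ack and the crash flag are purely illustrative.

```python
def process_with_ack(queue: list, deliver, crash_before_ack: bool = False) -> None:
    """Deliver each queued item to an endpoint, removing it from the queue
    only AFTER successful delivery (process first, acknowledge second)."""
    while queue:
        item = queue[0]          # peek at the head; do not remove yet
        deliver(item)            # side effect: write to external endpoint
        if crash_before_ack:     # simulate a crash in the window between
            raise RuntimeError(  # successful delivery and removal
                "crash before item removed from queue")
        queue.pop(0)             # ack: remove only after success

sent = []
q = [{"id": 1}, {"id": 2}]
try:
    process_with_ack(q, sent.append, crash_before_ack=True)
except RuntimeError:
    pass
# Item 1 was delivered but is still queued, so a restart would deliver it
# again: duplication instead of loss.
print(len(sent), len(q))
```

Flipping the order (remove first, deliver second) would trade the duplicate for a lost item on crash, which is exactly the trade-off the post says NiFi's design avoids.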
01-29-2024
05:44 AM
@PriyankaMondal Just to add to what @ckumar provided, the NiFi repositories are not locked to a specific node. What I mean by that is that they can be moved to a new node, with "new" being the key word there. A typical production NiFi setup will use protected storage for its flowfile_repository and content_repository(s), which hold all the FlowFile metadata and FlowFile content for all actively queued and archived FlowFiles on a node. To prevent loss of data, these repositories should be protected through the use of RAID storage or some other equivalent protected storage.

The data stored in these repositories is tightly coupled to the flow.xml.gz/flow.json.gz that the cluster is running on every node. Let's say you have a hardware failure; it may be faster to stand up a new server than to repair the failed hardware. You can simply move or copy the protected repositories to the new node before starting it. When the node starts and joins your existing cluster, it will inherit the cluster's flow.xml.gz/flow.json.gz and then begin loading the FlowFiles from those moved repositories into the connection queues. Processing will continue exactly where it left off on the old node.

There is no way to merge repositories together, so you cannot add the contents of one node's repositories to the already existing repositories of another node. The provenance_repository holds lineage data, and the database_repository holds flow configuration history and some node-specific info. Neither of these is needed to preserve the actual FlowFiles.

Hope this helps, Matt
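The "which repositories must travel with the FlowFiles" distinction can be summarized in a tiny sketch. This is only an illustration of the rule stated above (flowfile_repository and content_repository carry queued data; provenance and database repositories do not); repos_to_move is a hypothetical helper, not a NiFi tool.

```python
def repos_to_move(available: list) -> list:
    """Pick the repository directories that must move with queued FlowFiles.
    FlowFile metadata and content are required; lineage and flow-config
    history are not needed to preserve the actual FlowFiles."""
    required_prefixes = ("flowfile_repository", "content_repository")
    return [d for d in available if d.startswith(required_prefixes)]

dirs = [
    "flowfile_repository",
    "content_repository",
    "provenance_repository",
    "database_repository",
]
print(repos_to_move(dirs))
```

The actual move is an ordinary filesystem copy of those directories to the matching paths on the new node while NiFi is stopped.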
01-03-2024
08:15 AM
@PriyankaMondal What is being logged in the nifi-user.log when the issue happens? Have you tried using your browser's developer tools to look at the data being exchanged in the request with the NiFi cluster? It feels like the site cookies may not be getting sent to the NiFi node after successful authentication, resulting in the exception being seen. Thanks, Matt
10-06-2023
09:01 AM
@PriyankaMondal

1. I am not clear on the question here. Why use the Toolkit to create three keystores? I thought you were getting three certificates (one for each node) from your IT team. Use those to create the three unique keystores you will use.

2. It appears your DN has a wildcard in it. NiFi does not support the use of wildcards in the DN of node clientAuth certificates. This is because NiFi utilizes mutual-TLS connections, and the clientAuth DN is used to identify the unique connecting clients and to set up and configure authorizations.

Now, you could ask your IT team to create one keystore with a non-wildcard DN like "cn=nifi-cluster, ou=domainlabs, DC=com" and add all three of your NiFi nodes' hostnames as SAN entries in that one PrivateKeyEntry. This would allow you to use that same private-key keystore on all three NiFi nodes. This has downsides, like security: if the keystore on one node gets compromised, all hosts are compromised because it is reused. Also, all nodes will present the same client identity during authorization (since all present the same DN), so nothing will distinguish one node from another.

The keystore used by NiFi can contain ONLY one PrivateKeyEntry. Merging multiple keystores with private-key entries will result in one keystore with more than one PrivateKeyEntry, which is not supported by NiFi.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
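The one-PrivateKeyEntry rule is easy to check from `keytool -list` output before handing a keystore to NiFi. A minimal sketch of that check; the sample output lines below are illustrative, not copied from a real keystore, and private_key_entries is a hypothetical helper.

```python
def private_key_entries(keytool_list_output: str) -> int:
    """Count PrivateKeyEntry rows in `keytool -list` output.
    NiFi requires the keystore to contain exactly one."""
    return sum(
        1 for line in keytool_list_output.splitlines()
        if "PrivateKeyEntry" in line
    )

# Illustrative output resembling a merged keystore with two private keys.
sample = """\
nifi-node1, Jan 1, 2024, PrivateKeyEntry,
nifi-ca, Jan 1, 2024, trustedCertEntry,
nifi-node2, Jan 1, 2024, PrivateKeyEntry,
"""
count = private_key_entries(sample)
print(count, "->", "OK for NiFi" if count == 1 else "NOT supported by NiFi")
```

Anything other than a count of exactly one means the keystore needs to be rebuilt before NiFi will accept it.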