Member since
10-17-2022
17
Posts
3
Kudos Received
0
Solutions
04-01-2025
12:40 AM
Hi Team, below are the basic steps I am following to upgrade NiFi from version 1.23.2 to version 2.3.0 on my RHEL 9 server:

1. Unzip the newer NiFi version: unzip nifi-2.3.0-bin.zip
2. Copy the flow files from the older to the newer version: cp nifi-1.23.2/conf/flow.* nifi-2.3.0/conf/
3. Copy the state directory from the older to the newer version: cp -r nifi-1.23.2/state nifi-2.3.0/state/
4. Update nifi.properties with the sensitive props key: nifi.sensitive.props.key=<<same as older version>>
5. Start NiFi: sh nifi-2.3.0/bin/nifi.sh start

Problem statement: I am getting the NiFi UI console for the newer version (NiFi 2.3.0), and all the process groups and processors that were in the older version were copied to the new version as well. But a few processors (like JoltTransform, PublishKafka, etc.) still show that they are using the 1.23.2 NAR file.

Troubleshooting steps that did not work: I deleted the work directory and revalidated that lib is not using any NAR file from the older version.

1. Delete the work directory: rm -rf nifi-2.3.0/work/*
2. Delete any 1.23.2 NARs from the lib directory, in case it has any: rm -rf nifi-2.3.0/lib/*.1.23.2*.nar
3. Restart the NiFi service: sh nifi-2.3.0/bin/nifi.sh restart

I even manually changed the version in the flow.json and flow.xml files from 1.23.2 to 2.3.0. After doing this the library is 2.3.0, but the processors are in an invalid state. Can you please mention the steps that I am missing while upgrading from 1.23.2 to 2.3.0? Why are not all the processors getting upgraded as expected?
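The manual checks described (deleting work/, scanning lib/ for old NARs) can be automated. Below is a minimal, hedged sketch of such a scan; find_stale_nars is a hypothetical helper and the file names in the demonstration are made up, not actual NiFi NAR names.

```python
import tempfile
from pathlib import Path

def find_stale_nars(root: str, old_version: str) -> list[str]:
    """Return relative paths of .nar files still carrying the old version string."""
    base = Path(root)
    return sorted(
        str(p.relative_to(base))
        for p in base.rglob("*.nar")
        if old_version in p.name
    )

# Demonstration against a throwaway directory layout (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "lib").mkdir()
    (root / "work" / "nar").mkdir(parents=True)
    (root / "lib" / "nifi-jolt-nar-2.3.0.nar").touch()
    (root / "work" / "nar" / "nifi-kafka-nar-1.23.2.nar").touch()
    stale = find_stale_nars(tmp, "1.23.2")
    print(stale)  # any hits mean an old-version NAR is still on disk
```

Running a scan like this over the whole 2.3.0 install directory (including work/ and any extensions/ directory) before startup helps confirm no 1.23.2 NAR is left to be loaded.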
Labels:
Apache NiFi
03-21-2025
12:16 PM
@PriyankaMondal Looking at your error message log, I can see you're experiencing authentication timeouts with the ConsumeIMAP and ConsumePOP3 processors when connecting to Microsoft Office 365 services.

Possible blockers:

Timeout issue: The primary error is "Read timed out" during authentication, which suggests the connection to Office 365 is being established but then timing out during the OAUTH2 handshake.

Microsoft 365 specific considerations: Microsoft has specific requirements for modern authentication with mail services and has been deprecating basic authentication methods.

Processor configuration: Using the OAUTH2 authentication mode is correct for Office 365, but there may be issues with token acquisition or the timeout settings.

Possible solutions:

1. Check timeout settings. Add these properties to your processor configuration:
mail.imap.connectiontimeout=60000
mail.imap.timeout=60000
mail.pop3.connectiontimeout=60000
mail.pop3.timeout=60000

2. Verify Modern Authentication settings. Ensure the account has Modern Authentication enabled in the Microsoft 365 Admin Center, and verify the application registration in Azure AD has the correct permissions: IMAP.AccessAsUser.All for IMAP, POP.AccessAsUser.All for POP3, and the offline_access scope for refresh tokens.

3. Update NiFi mail libraries. NiFi's default JavaMail implementation might have compatibility issues with Office 365. Try updating to the latest version of NiFi (if possible), or add Microsoft's MSAL (Microsoft Authentication Library) JAR to NiFi's lib directory.

4. Use a custom SSL Context Service. Microsoft servers might require specific TLS settings. Create a Standard SSL Context Service with Protocol: TLS and reference it in the advanced client settings for the processor.

5. Alternative approach: use the Microsoft Graph API. Since Microsoft is moving away from direct IMAP/POP3 access, consider using the InvokeHTTP processor to authenticate against the Microsoft Graph API, then using the Graph API endpoints to retrieve email content.

6. Check proxy settings. If your environment uses proxies, add these properties:
mail.imap.proxy.host=your-proxy-host
mail.imap.proxy.port=your-proxy-port
mail.pop3.proxy.host=your-proxy-host
mail.pop3.proxy.port=your-proxy-port

7. Implementation steps:
- Update the processor configuration with extended timeout values.
- Verify the OAuth2 settings in the processor match your Azure application registration exactly.
- Check the Microsoft 365 account settings to ensure IMAP/POP3 is enabled with Modern Authentication.
- Consider implementing a token-debugging flow using InvokeHTTP to validate token acquisition separately.

Happy hadooping
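The token-debugging idea above can be sketched by building the token request by hand. This is a minimal sketch assuming the Microsoft identity platform v2.0 client-credentials flow; the tenant/client values are placeholders, and the actual POST (via InvokeHTTP, curl, or an HTTP library) is left out.

```python
def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, dict]:
    """Build the POST target URL and form body for an OAuth2 client-credentials
    token request against the Microsoft identity platform v2.0 endpoint."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Client-credentials flows use the .default scope; delegated flows
        # would request IMAP.AccessAsUser.All / POP.AccessAsUser.All instead.
        "scope": "https://outlook.office365.com/.default",
    }
    return url, body

# Placeholder values, not real credentials.
url, body = build_token_request("my-tenant-id", "my-client-id", "my-secret")
print(url)
```

POSTing that body (form-encoded) and inspecting the JSON response separately from NiFi makes it easier to tell whether the timeout is in token acquisition or in the IMAP/POP3 session itself.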
08-06-2024
04:35 AM
@PriyankaMondal wrote: Hi Team, I want to achieve the below-mentioned transformation in NiFi using any processor. Please help me to get this done.

Sample input:
{
  "date": "35 days 11:13:10.88",
  "key1": "value1",
  "keyToBeMapped1": "hostname.com",
  "key2": "value2",
  "key3": "value3",
  "key4": "value4",
  "keyToBeMapped2": "High Paging Rate",
  "key5": "PAGING",
  "keyToBeMapped3": "A high paging activity has been detected on host abc.lab.com. This could mean that too many processes are being run",
  "Entity OID": "keyToBeMapped1",
  "Parameter": "keyToBeMapped2",
  "Description": "keyToBeMapped3"
}

Expected output:
{
  "date": "35 days 11:13:10.88",
  "key1": "value1",
  "keyToBeMapped1": "hostname.com",
  "key2": "value2",
  "key3": "value3",
  "key4": "value4",
  "keyToBeMapped2": "High Paging Rate",
  "key5": "PAGING",
  "keyToBeMapped3": "A high paging activity has been detected on host abc.lab.com. This could mean that too many processes are being run",
  "Entity OID": "hostname.com",
  "Parameter": "High Paging Rate",
  "Description": "A high paging activity has been detected on host abc.lab.com. This could mean that too many processes are being run"
}

Regards, Priyanka

Hello, you can achieve this transformation in NiFi using the JoltTransformJSON processor. Jolt is a JSON-to-JSON transformation library that allows you to specify transformations in a declarative way. Here's how you can set it up.

Steps to configure the JoltTransformJSON processor:
1. Add the JoltTransformJSON processor: drag and drop the JoltTransformJSON processor onto your NiFi canvas.
2. Configure the processor: double-click the processor to open its configuration dialog and go to the Properties tab.
3. Set the Jolt Specification: in the Jolt Specification property, define the transformation rules. Here's the Jolt spec you need for your transformation: [
{
"operation": "shift",
"spec": {
"date": "date",
"key1": "key1",
"keyToBeMapped1": "keyToBeMapped1",
"key2": "key2",
"key3": "key3",
"key4": "key4",
"keyToBeMapped2": "keyToBeMapped2",
"key5": "key5",
"keyToBeMapped3": "keyToBeMapped3",
"Entity OID": "@(1,keyToBeMapped1)",
"Parameter": "@(1,keyToBeMapped2)",
"Description": "@(1,keyToBeMapped3)"
}
}
]

4. Apply the configuration: click Apply to save the configuration.
5. Connect the processor: connect the JoltTransformJSON processor to the next processor in your flow.

Explanation of the Jolt specification: the operation is set to shift, which means we are mapping fields from the input JSON to the output JSON. The spec defines the mapping rules. For example, "Entity OID": "@(1,keyToBeMapped1)" means that the value of keyToBeMapped1 should be assigned to Entity OID in the output JSON.

Example flow:
1. GenerateFlowFile (to simulate input JSON)
2. JoltTransformJSON (with the above specification)
3. LogAttribute (to log the transformed JSON)

This setup should transform your input JSON into the expected output format. Hope this helps.

Best regards, florence023
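Outside NiFi, the intended mapping can be described in a few lines of Python. This is a sketch of what the transformation is meant to do (replace a value that names another key with the value stored under that key), not the Jolt engine itself; map_reference_keys is a hypothetical helper.

```python
def map_reference_keys(record: dict, mapping: dict) -> dict:
    """For each target key, replace its value with the value stored under
    the referenced source key, leaving all other fields untouched."""
    out = dict(record)
    for target, source_key in mapping.items():
        if source_key in record:
            out[target] = record[source_key]
    return out

# Abbreviated version of the sample input above.
sample = {
    "keyToBeMapped1": "hostname.com",
    "keyToBeMapped2": "High Paging Rate",
    "Entity OID": "keyToBeMapped1",
    "Parameter": "keyToBeMapped2",
}
result = map_reference_keys(
    sample, {"Entity OID": "keyToBeMapped1", "Parameter": "keyToBeMapped2"}
)
print(result["Entity OID"], "|", result["Parameter"])
```

Checking the Jolt spec against this reference behavior (for example in the Jolt demo sandbox) is a quick way to validate the spec before wiring it into the flow.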
07-21-2024
09:48 PM
Hi Team, I am using HandleHttpRequest to receive data in NiFi (version 1.23.2), and the number of nodes in the NiFi cluster is 3. Data can arrive through a load balancer that distributes the load in a round-robin way. My requirement is to count the total number of events per minute across the cluster. How can I achieve that?

My current flow: HandleHttpRequest --> MergeRecord (bin age is 1 min) --> CalculateRecordStats (to get the count)

But my current flow counts the number of events received by a single node in a minute, not the total number of events received by the cluster in 1 minute. Please suggest.

Regards, Priyanka Mondal
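The missing piece in a flow like this is an aggregation step across nodes: each node counts what it received, and something has to sum those per-node counts per minute window. A minimal sketch of that aggregation step, assuming each node can publish its per-minute count to some shared store (node names and numbers here are made up):

```python
from collections import Counter

def cluster_counts(per_node_counts: list[dict]) -> Counter:
    """Sum per-minute event counts reported by each node into cluster-wide totals."""
    total = Counter()
    for node in per_node_counts:
        total.update(node)  # adds counts key-by-key (minute window -> events)
    return total

# Hypothetical per-node counts for two one-minute windows.
node_a = {"12:00": 40, "12:01": 55}
node_b = {"12:00": 38, "12:01": 61}
node_c = {"12:00": 42, "12:01": 50}
totals = cluster_counts([node_a, node_b, node_c])
print(totals["12:00"], totals["12:01"])
```

In NiFi terms, the same effect could be had by routing the per-node counts to one place (for example a load-balanced connection set to a single node, or an external store) before summing; the sketch only shows the arithmetic, not a specific NiFi design.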
Labels:
Apache NiFi
07-17-2024
10:20 AM
1 Kudo
@PriyankaMondal In versions of Apache NiFi older than 1.16, NiFi does not allow any edits within the NiFi cluster while a node is disconnected. Changes are only allowed on the actual disconnected node.

In Apache NiFi 1.16.0, NiFi introduced a new flow inheritance feature that allowed nodes joining with an existing flow.xml.gz/flow.json.gz that does not match the cluster-elected flow to join the cluster by inheriting the cluster-elected flow. A joining node would only be blocked from this process if inheriting the cluster flow would result in data loss (meaning the joining node's flow contains a connection holding queued FlowFiles and the cluster-elected flow does not have that connection).

Later it was determined that this change can make it difficult to handle the outcome of the above issue: https://issues.apache.org/jira/browse/NIFI-11333

So it was decided that the best course of action was to not allow any component deletion while a node is disconnected. When a NiFi node is started, it attempts to join the cluster. If the node fails to join the cluster, it shuts back down to keep users from mistakenly using it as a standalone node. That means the user had no easy way to handle the queued data in the connection preventing the rejoin. Of course, users could configure the node to come up standalone, but that does not make things any easier on the end user. The node loads up standalone, loads its FlowFiles, and, depending on whether auto.resume was set or not, starts processing FlowFiles. This still leaves the user with FlowFiles queued in many connections throughout the UI, and they would have a very difficult time determining which connection(s) were removed and need to be processed out in order to rejoin the cluster. So the decision was made to stop allowing deletion while a node is disconnected.

That being said, when a NiFi cluster has a disconnected node, users can decide to navigate to the cluster UI and drop the disconnected node(s) from the cluster. The cluster will then have full functionality again, as it will report all existing nodes as connected. It will require a restart of the dropped node(s) to get them to attempt to connect to the cluster again. But keep in mind that when a dropped node attempts to join the cluster and inherit the cluster flow, you may run into the problem described above.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
02-20-2024
02:23 AM
1 Kudo
@PriyankaMondal, Did any of the responses assist in resolving your query? If so, kindly mark the relevant reply as the solution, as it will aid others in locating the answer more easily in the future.
02-01-2024
07:09 AM
1 Kudo
@PriyankaMondal I am not clear on your statement:

if Nifi processor (any processor within a process group) stops suddenly due to load/any other issue

Are you saying you see a NiFi processor transition to a stopped state unexpectedly? This should never happen. Or are you saying the processor seems to stop processing FlowFiles even though it is currently in a running/started state?

NiFi queues FlowFiles in the connections between processor components. A FlowFile is not removed from the inbound connection to a processor component until that FlowFile has been successfully processed by the consuming processor. The FlowFile consists of two parts:
1. FlowFile attributes/metadata, persisted in the NiFi flowfile_repository.
2. FlowFile content, persisted within claims inside the content_repository.

To protect from data loss, these repositories should be using protected storage such as RAID. So if NiFi were to suddenly crash, or the server itself crash, when NiFi is restarted on that down node it will load its flow and then load the FlowFiles back into the connections. Processing will begin again against those FlowFiles by the downstream processor components.

NiFi's design favors data duplication over data loss in order to avoid the possibility of losing data. For example: let's assume a NiFi processor completed execution against a FlowFile, resulting in writing something out to an external endpoint. In response to that successful operation, the processor would then move the FlowFile from the inbound connection to a downstream relationship. If NiFi were to crash in that very moment, before the FlowFile was moved, on startup the same FlowFile would load in the inbound connection and get processed again.

Also keep in mind that you are running a 3-node NiFi cluster, and within a NiFi cluster each connected node runs its own copy of the flow, its own set of repositories, and its own local state. So each node is unaware of the FlowFiles being processed by another node in the same cluster.

Generally speaking, when you have a processor that shows the active-threads indicator with zeroed-out stats, you either have a very long running thread or a hung thread (only examination of a series of thread dumps can make that determination). Most commonly this is a resource utilization problem, but it could also be a dataflow design issue, a client library issue, or a network issue.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
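The process-then-ack semantics described above (a FlowFile stays in the inbound connection until processing succeeds, so a crash yields duplication rather than loss) can be illustrated with a small sketch. This models the general at-least-once pattern, not NiFi's internals; process_with_ack and the crash flag are purely illustrative.

```python
def process_with_ack(queue: list, deliver, crash_before_ack: bool = False) -> None:
    """Deliver each queued item to an endpoint, removing it from the queue
    only AFTER successful delivery (process first, acknowledge second)."""
    while queue:
        item = queue[0]          # peek at the head; do not remove yet
        deliver(item)            # side effect: write to external endpoint
        if crash_before_ack:     # simulate a crash in the window between
            raise RuntimeError(  # successful delivery and removal
                "crash before item removed from queue")
        queue.pop(0)             # ack: remove only after success

sent = []
q = [{"id": 1}, {"id": 2}]
try:
    process_with_ack(q, sent.append, crash_before_ack=True)
except RuntimeError:
    pass
# Item 1 was delivered but is still queued, so a restart would deliver it
# again: duplication instead of loss.
print(len(sent), len(q))
```

Flipping the order (remove first, deliver second) would trade the duplicate for a lost item on crash, which is exactly the trade-off the post says NiFi's design avoids.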
01-29-2024
05:44 AM
@PriyankaMondal Just to add to what @ckumar provided, the NiFi repositories are not locked to a specific node. What I mean by that is that they can be moved to a new node, with "new" being the key word there. A typical production NiFi setup will use protected storage for its flowfile_repository and content_repository(s), which hold all the FlowFile metadata and FlowFile content for all actively queued and archived FlowFiles on a node. To prevent loss of data, these repositories should be protected through the use of RAID storage or some other equivalent protected storage.

The data stored in these repositories is tightly coupled to the flow.xml.gz/flow.json.gz that the cluster is running on every node. Let's say you have a hardware failure; it may be faster to stand up a new server than to repair the failed hardware. You can simply move or copy the protected repositories to the new node before starting it. When the node starts and joins your existing cluster, it will inherit the cluster's flow.xml.gz/flow.json.gz and then begin loading the FlowFiles from those moved repositories into the connection queues. Processing will continue exactly where it left off on the old node.

There is no way to merge repositories together, so you cannot add the contents of one node's repositories to the already existing repositories of another node. The provenance_repository holds lineage data, and the database_repository holds flow configuration history and some node-specific info. Neither of these is needed to preserve the actual FlowFiles.

Hope this helps, Matt
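The "which repositories must travel with the FlowFiles" distinction can be summarized in a tiny sketch. This is only an illustration of the rule stated above (flowfile_repository and content_repository carry queued data; provenance and database repositories do not); repos_to_move is a hypothetical helper, not a NiFi tool.

```python
def repos_to_move(available: list) -> list:
    """Pick the repository directories that must move with queued FlowFiles.
    FlowFile metadata and content are required; lineage and flow-config
    history are not needed to preserve the actual FlowFiles."""
    required_prefixes = ("flowfile_repository", "content_repository")
    return [d for d in available if d.startswith(required_prefixes)]

dirs = [
    "flowfile_repository",
    "content_repository",
    "provenance_repository",
    "database_repository",
]
print(repos_to_move(dirs))
```

The actual move is an ordinary filesystem copy of those directories to the matching paths on the new node while NiFi is stopped.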
01-03-2024
08:15 AM
@PriyankaMondal What is being logged in the nifi-user.log when the issue happens? Have you tried using your browser's developer tools to look at the data being exchanged in the request with the NiFi cluster? It feels like the site cookies may not be getting sent to the NiFi node after successful authentication, resulting in the exception being seen. Thanks, Matt
10-06-2023
09:01 AM
@PriyankaMondal

1. I am not clear on the question here. Why use the Toolkit to create three keystores? I thought you were getting three certificates (one for each node) from your IT team. Use those to create the three unique keystores you will use.

2. It appears your DN has a wildcard in it. NiFi does not support the use of wildcards in the DN of node clientAuth certificates. This is because NiFi utilizes mutual-TLS connections, and the clientAuth DN is used to identify the unique connecting clients and to set up and configure authorizations.

Now, you could ask your IT team to create one keystore with a non-wildcard DN like "cn=nifi-cluster, ou=domainlabs, DC=com" and add all three of your NiFi nodes' hostnames as SAN entries in that one PrivateKeyEntry. This would allow you to use that same private-key keystore on all three NiFi nodes. This has downsides, like security: if the keystore on one node gets compromised, all hosts are compromised because it is reused. Also, all nodes will present the same client identity during authorization (since all present the same DN), so nothing will distinguish one node from another.

The keystore used by NiFi can contain ONLY one PrivateKeyEntry. Merging multiple keystores with private-key entries will result in one keystore with more than one PrivateKeyEntry, which is not supported by NiFi.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
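The one-PrivateKeyEntry rule is easy to check from `keytool -list` output before handing a keystore to NiFi. A minimal sketch of that check; the sample output lines below are illustrative, not copied from a real keystore, and private_key_entries is a hypothetical helper.

```python
def private_key_entries(keytool_list_output: str) -> int:
    """Count PrivateKeyEntry rows in `keytool -list` output.
    NiFi requires the keystore to contain exactly one."""
    return sum(
        1 for line in keytool_list_output.splitlines()
        if "PrivateKeyEntry" in line
    )

# Illustrative output resembling a merged keystore with two private keys.
sample = """\
nifi-node1, Jan 1, 2024, PrivateKeyEntry,
nifi-ca, Jan 1, 2024, trustedCertEntry,
nifi-node2, Jan 1, 2024, PrivateKeyEntry,
"""
count = private_key_entries(sample)
print(count, "->", "OK for NiFi" if count == 1 else "NOT supported by NiFi")
```

Anything other than a count of exactly one means the keystore needs to be rebuilt before NiFi will accept it.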