Member since
07-30-2019
3406
Posts
1623
Kudos Received
1008
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 312 | 12-17-2025 05:55 AM |
| | 373 | 12-15-2025 01:29 PM |
| | 355 | 12-15-2025 06:50 AM |
| | 344 | 12-05-2025 08:25 AM |
| | 594 | 12-03-2025 10:21 AM |
07-01-2022
08:57 AM
1 Kudo
@Brenigan Are you running your dataflow on a standalone NiFi install or a NiFi cluster install? If a multi-node NiFi cluster, are all 200 FlowFiles on the same NiFi node? Does your partition_number start at 0? Do you see your FlowFiles getting routed to the "overtook" relationship after 10 minutes?

Assuming all of the following:
1. All FlowFiles are on the same NiFi node.
2. partition_number starts at "0" and increments consistently by "1".
3. All FlowFiles have the same filename.
4. The "wait" relationship is routed via a connection back to the EnforceOrder processor.

you should be seeing:
1. All FlowFiles routed to the "wait" relationship until a FlowFile with attribute "partition_number" equal to "0" is processed, which routes that FlowFile to "success".
2. The other FlowFiles meeting the above four criteria continuing to loop through "wait" until a "partition_number" attribute with value "1" is seen and routed to "success", and so on.
3. If a FlowFile in the incremental order is missing, all FlowFiles with a partition_number higher than the next expected integer continuing to route to the "wait" relationship.
4. After the configured "wait timeout", any FlowFile that has been waiting that long being routed to the "overtook" relationship.

You can right-click on a connection holding the FlowFiles and list the queue. From there you can select the "view details" icon at the far left to examine a FlowFile's current attributes. You should see a new attribute "EnforceOrder.expectedOrder" that contains the next expected integer value that the group this FlowFile belongs to is waiting for. You will also find your "partition_number" attribute, which holds the current integer for this FlowFile. If you have your FlowFiles distributed across multiple nodes in a NiFi cluster, you will need to get all FlowFiles with the same group identifier moved to the same NiFi node in order to enforce order (you cannot enforce order across different nodes in a NiFi cluster).
You can accomplish this by editing the connection feeding your EnforceOrder processor and, under Settings, selecting a "Load Balancing Strategy" of "Partition by Attribute" using the "filename" attribute that you are using as the group identifier in the EnforceOrder processor. If this response assisted with your query, please take a moment to log in and click "Accept as Solution" below this post. Thank you, Matt
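The per-group release behavior described above can be sketched as a small simulation. This is illustrative Python only, not NiFi code; FlowFiles are reduced to plain dicts with the "filename" group identifier and "partition_number" attribute:

```python
# Illustrative sketch of EnforceOrder's per-group release logic (not NiFi source).
# FlowFiles sharing a group identifier (here: "filename") are released to
# "success" in "partition_number" order; out-of-order files loop through "wait".

def enforce_order(flowfiles):
    expected = {}                       # group identifier -> next expected partition_number
    released, waiting = [], list(flowfiles)
    progressed = True
    while waiting and progressed:       # keep looping while something gets released
        progressed = False
        still_waiting = []
        for ff in waiting:
            group = ff["filename"]
            if ff["partition_number"] == expected.get(group, 0):
                released.append(ff)     # routed to "success"
                expected[group] = ff["partition_number"] + 1
                progressed = True
            else:
                still_waiting.append(ff)  # routed to "wait", loops back around
        waiting = still_waiting
    # leftovers would eventually route to "overtook" after the wait timeout
    return released, waiting

files = [{"filename": "a.txt", "partition_number": n} for n in (2, 0, 1, 4)]
done, stuck = enforce_order(files)
print([f["partition_number"] for f in done])   # released in order: 0, 1, 2
print([f["partition_number"] for f in stuck])  # 4 keeps waiting because 3 is missing
```

Note how the file with partition_number 4 never releases: the next expected integer for its group stays at 3, matching behavior 3 in the list above.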
06-30-2022
12:34 PM
1 Kudo
@pandav You cannot offload a NiFi node that is down. Can you clarify what you mean by "down"? Was the NiFi service not running on the nodes you attempted to offload? The offload option in the cluster UI sends a request to the disconnected (not down) node to offload its queued FlowFiles to the nodes still connected to the cluster. If your nodes are down, you'll need to start the service on those nodes again. On startup (assuming no issues), these nodes will rejoin your cluster.

If you plan to decommission a node later, you can use the NiFi cluster UI to manually disconnect the node and then offload that node's FlowFiles. Once the FlowFiles have been successfully offloaded, the node can be deleted from the cluster using the NiFi cluster UI. Note: restarting a node that has been dropped/deleted from the cluster will trigger that node to start heartbeating to the cluster and thus reconnect, unless you edit the configuration of the node so it does not use the same ZooKeeper znode as the current cluster (the nifi.zookeeper.root.node property in the nifi.properties file). https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#basic-cluster-setup

As for your nodes going down on a configuration change, you'll want to inspect the NiFi logs for any exceptions or timeouts that may have occurred. Network issues, long Garbage Collection (GC) pauses, and resource congestion/exhaustion can lead to nodes not responding to, or not receiving, the replicated change request; as a result a node can get disconnected. In scenarios like this, if you are using the latest Apache NiFi release, those nodes should automatically reconnect. Upon reconnect, if the node's flow does not match the cluster flow, the node will automatically take the cluster's flow and join. In older releases, a flow mismatch between the connecting node and the cluster would require manual intervention (copying the flow.xml.gz from a node still in the cluster to the node not connecting).
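For example, a node removed from the cluster can be pointed at a different ZooKeeper znode so it no longer heartbeats to the old cluster. The znode path below is only an example value, not a required name:

```properties
# nifi.properties on the removed node:
# use any znode path other than the one used by the existing cluster (example value)
nifi.zookeeper.root.node=/nifi-decommissioned
```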
If this response assisted with your query, please take a moment to log in and click "Accept as Solution" below this post. Thank you, Matt
06-30-2022
07:14 AM
@Gogineni Good to see that you are now getting a 401 instead of a 400. So this becomes more a question of what your endpoint's REST API expects in the GET method. I am not sure what is in the content of your FlowFile, but are you sure you want to send it in your GET request at this point in your dataflow? It was likely used earlier to fetch your access token, but it is probably not needed now, so try changing "Send Message Body" to "false". Also, I am not sure how long the token you obtained is valid. Have you tried performing the same request via curl from the command line? If this response assisted with your query, please take a moment to log in and click "Accept as Solution" below this post. Thank you, Matt
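As a quick sanity check outside NiFi, you can build the same request yourself. Here is a hedged Python sketch; the URL and token are placeholders for your endpoint and the access token fetched earlier, and the point is simply that the GET carries only the Authorization header, no body (the rough equivalent of "Send Message Body" = false):

```python
# Build a GET request with only an Authorization header and no request body,
# roughly mirroring InvokeHTTP with "Send Message Body" set to false.
import urllib.request

token = "REPLACE_WITH_YOUR_ACCESS_TOKEN"       # placeholder value
req = urllib.request.Request(
    "https://example.com/api/resource",        # placeholder endpoint
    method="GET",
    headers={"Authorization": f"Bearer {token}"},
)
print(req.get_method())                  # GET
print(req.get_header("Authorization"))   # the header your endpoint should see
# To actually send it: urllib.request.urlopen(req)
# (this will fail until real endpoint/token values are filled in)
```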
06-30-2022
06:04 AM
@araujo Once "nifi.content.repository.archive.enabled" is set to false, content claims that no longer have any FlowFiles referencing them will no longer get moved to "archive" sub-directories; they instead simply get removed. The backpressure logic checks whether there are still archived content claims and, if there are none, allows content to continue to be written until the disk is full. If the archive claimant count is not zero, backpressure kicks in until that count drops to zero through archive removal. This backpressure mechanism is in place so that the archive clean-up process can catch up. The fact that NiFi will allow content repo writes until the disk is full is why it is important that users do not co-locate any of the other NiFi repositories on the same disk as the content repository. If disk filling became an issue, the processors that write new content would not just stop executing; they would start throwing exceptions in the logs about insufficient disk space.

@Drozu The original image shared of your canvas shows a MergeContent processor in a "running" state, but it does not indicate an active thread at the time of image capture. An active thread shows as a small number in the upper-right corner of the processor. The image also shows that this MergeContent processor executed 2,339 tasks in just the last 5 minutes. Execution does not mean an output FlowFile will always be produced: if none of the bins are eligible for merge, then nothing is going to be output. When the processor is in a state of "not working", do all of that processor's stats go to 0, including "Tasks/Time"? Does it at that same time show a number in its upper-right corner? That would indicate the processor has an active thread that has been executing for over 5 minutes. In a scenario like this, it is best to get a series of ~4 NiFi thread dumps to see why that thread is running so long and what it is waiting on.

If the stats go to zero and you do not see an active thread number on the processor, the processor has not been given a thread from the Timer Driven thread pool in the last 5 minutes. Then you need to look at thread pool usage per node: is the complete thread pool in use by other components? Thanks, Matt
06-30-2022
05:27 AM
@Gogineni The dynamic property you added to the InvokeHTTP processor is not using valid NiFi Expression Language (NEL). The name of your custom property, "Authorization", becomes the header name, and the evaluated NEL becomes the value of that header.

What you have is:

$.Authorization

However, the valid NEL to return the value of the NiFi FlowFile attribute "Authorization" created by the earlier UpdateAttribute processor would be:

${Authorization}

If this response assisted with your query, please take a moment to log in and click "Accept as Solution" below this post. Thank you, Matt
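To illustrate the difference, here is a toy Python sketch. It only mimics NEL's attribute substitution for this one case (it is not how NiFi implements it); the point is that "${Authorization}" resolves to the attribute's value while "$.Authorization" (JsonPath syntax) is just a literal string to NEL:

```python
# Toy illustration of NiFi Expression Language attribute substitution
# (not NiFi code): replace every ${attrName} with the matching FlowFile
# attribute value; anything else passes through unchanged.
import re

attributes = {"Authorization": "Bearer abc123"}  # as set by UpdateAttribute

def resolve_nel(text, attrs):
    # Substitute ${name} patterns with the corresponding attribute value.
    return re.sub(r"\$\{(\w+)\}", lambda m: attrs.get(m.group(1), ""), text)

print(resolve_nel("${Authorization}", attributes))  # Bearer abc123
print(resolve_nel("$.Authorization", attributes))   # $.Authorization (literal, unchanged)
```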
06-29-2022
01:06 PM
@rafy Have you looked at the following processors?
1. GenerateTableFetch
2. QueryDatabaseTable
3. QueryDatabaseTableRecord
Thanks, Matt
06-29-2022
12:49 PM
@Drozu Switching off content repository archiving would not result in an automatic clean-up of your already archived content claims. Make sure that all the "archive" sub-directories in the numbered directories within the content repository are empty. After disabling archiving, was there any change in disk utilization on your 3 nodes? Did the content repository disk fill to 100%?

There are many things that go into evaluating the performance of your NiFi and its dataflows, and any time you add new components via the NiFi canvas, the dynamics can change:
- How many components are running? (If all 50 timer driven threads are currently in use, other components will just be waiting for an available thread.)
- How often is JVM garbage collection (GC) happening?
- How many timer driven threads are in use at the time the processors seem to stop?
- How are the queued FlowFiles feeding the MergeContent distributed across your 3 nodes?
- How many concurrent tasks are set on the MergeContent?
- What do the cluster UI stats show for per-node thread count, GC stats, CPU load average, etc.?
- Is there any other WARN or ERROR log output in nifi-app.log (perhaps related to OOM or open file limits)?

It looks like you are using your MergeContent processor to merge two FlowFiles that have the same filename attribute value. Does one inbound connection contain one FlowFile of the pair and the other connection contain the other? The MergeContent is not going to parse through the queued FlowFiles looking for a match. How are you handling "failure" with the MergeContent? The processor round-robins its inbound connections: on one execution it reads from connection A and bins those FlowFiles, and on the next execution it reads from connection B. Try adding a funnel before your MergeContent, redirecting your two source success connections into that funnel, and dragging a single connection from the funnel to the MergeContent. Thank you, Matt
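The pairing behavior with a single inbound queue (e.g. after the funnel) can be sketched as follows. This is an illustrative Python toy, not NiFi code; it mimics binning by a correlation attribute ("filename") with a bin considered complete at 2 entries:

```python
# Toy sketch of correlation-attribute binning (illustrative, not NiFi code):
# FlowFiles from one inbound queue are binned by "filename"; a bin merges
# as soon as it holds 2 entries (think Minimum/Maximum Number of Entries = 2).
from collections import defaultdict

def merge_pairs(queue):
    bins = defaultdict(list)
    merged = []
    for ff in queue:
        bins[ff["filename"]].append(ff)
        if len(bins[ff["filename"]]) == 2:   # pair complete -> merge the bin
            merged.append(bins.pop(ff["filename"]))
    return merged, dict(bins)                # leftover bins wait for a partner

queue = [{"filename": "a"}, {"filename": "b"}, {"filename": "a"}]
merged, waiting = merge_pairs(queue)
print(len(merged))     # the "a" pair merged
print(list(waiting))   # "b" still waiting for its partner
```

With both halves of every pair arriving on the same queue, a bin completes as soon as its partner shows up; a bin whose partner never arrives just sits, which is why an unmatched FlowFile on one of two separate connections can look like the processor is "not working".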
06-29-2022
12:17 PM
@ajignacio You should carefully read all the migration guidance leading up to 1.16, starting with "Migrating from 1.x.x to 1.10.0". Take special note of:
1. Any nars that may have been removed; make sure your dataflows are not using any processors from those removed nars.
2. Any reported changes to specific components you may use in your dataflows.
3. Check that your dataflow does not have any processors with inbound connections scheduled to execute on "Primary Node" only (small "P" in the upper-left corner of the processor).
4. The migration steps involving the sensitive props key. If you had not set one previously, you may want to use the NiFi Toolkit to create a new user-defined one and re-encrypt the sensitive property values in the flow.xml.gz using that new sensitive props key.
5. Make sure you are running a Java version supported by NiFi 1.16 before migration.
NOTE: While Apache NiFi has limits on the maximum size of its release artifacts, forcing deprecation of older nars, Cloudera's CFM distributions of Apache NiFi do not, and include almost all Apache nars in addition to Cloudera-specific nars. If this response assisted with your query, please take a moment to log in and click "Accept as Solution" below this post. Thank you, Matt
06-29-2022
11:12 AM
2 Kudos
@rafy Each node in a NiFi cluster has its own copy of the dataflow and executes independently of the other nodes. Some NiFi components are capable of writing cluster state to ZooKeeper to avoid data duplication across nodes. The NiFi ingest-type components that support this should be configured to execute on the primary node only. In a NiFi cluster, one node is elected as the primary node (which node is elected can change at any time), so if a primary node change happens, the same component on a different node will be scheduled and will retrieve the cluster state to avoid ingesting the same data again.

Often, the component that records state does not retrieve the content itself. It simply generates the metadata/attributes necessary to get the content later, with the expectation that in your flow design you distribute those FlowFiles across all nodes before content retrieval. For example:

ListSFTP (primary node execution) --> success connection (with round robin LB configuration) --> FetchSFTP (all node execution)

The ListSFTP creates a 0-byte FlowFile for each source file that will be fetched. The FetchSFTP processor uses that metadata/attributes to get the actual source content and add it to the FlowFile. Another example for your query might be:

GenerateTableFetch (primary node execution) --> LB connection --> ExecuteSQL

The goal with these dataflows is to avoid having one node ingest all the content (adding network and disk I/O) only to then add more network and disk I/O to spread that content across all nodes. Instead, we simply get details about the data to be fetched so those details can be distributed across all nodes, and each node fetches only a specific portion of the source data. If this response assisted with your query, please take a moment to log in and click "Accept as Solution" below this post. Thank you, Matt
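The list-then-fetch pattern above can be sketched as a toy simulation. This is illustrative Python only; the node names and filenames are made up, and the round-robin assignment stands in for the connection's round robin load-balancing strategy:

```python
# Toy sketch of the list/fetch pattern (not NiFi code):
# the primary node lists metadata only, then a round-robin load-balanced
# connection spreads the zero-byte FlowFiles so every node fetches its share.
from itertools import cycle

# Primary node lists the source files (metadata only, no content yet).
listings = [f"file_{i}.csv" for i in range(6)]

# Round-robin distribution across the cluster nodes (example node names).
nodes = {"node1": [], "node2": [], "node3": []}
for target, filename in zip(cycle(nodes), listings):
    nodes[target].append(filename)

for node, share in nodes.items():
    print(node, share)   # each node fetches 2 of the 6 files
```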
06-29-2022
06:45 AM
@shuhaib3 The nifi.properties file is not the correct file to pass to the "-p" option of the NiFi Toolkit cli.sh. The "-p" option expects a properties file that you build with specific properties in it. For example: baseUrl=https://<target node hostname>:<target node port>
keystore=/path/to/keystore.jks
keystoreType=JKS
keystorePasswd=changeme
keyPasswd=changeme
truststore=/path/to/truststore.jks
truststoreType=JKS
truststorePasswd=changeme
proxiedEntity=nifiadmin

The nifi.properties file does not include these exact property names and includes other properties not used by cli.sh.

The following exception:

ERROR: Error executing command 'current-user' : PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

indicates a trust-chain issue between the client (cli.sh) and the server (target NiFi). This means the truststore is missing one or more TrustedCertEntry entries for the PrivateKeyEntry presented from the keystore in the mutual TLS handshake. Essentially, the client initiates a connection to the server, and the server responds with its serverAuth certificate along with a list of trusted authorities (TrustedCertEntry entries) from the server's truststore. Every certificate, private (PrivateKeyEntry) or public (TrustedCertEntry), has an owner (the certificate's distinguished name (DN)) and an issuer (the DN of the signer of that certificate). The client looks at the issuer of the server's certificate and checks its truststore for a certificate owner with that same DN. If found, it checks the issuer of that certificate to see whether issuer and owner have the same DN (self-signed). If they are not the same, it looks again for a certificate with an owner matching that issuer DN. This continues until it finds the root signing certificate (the root certificate has the same issuer and owner). This complete chain of certificate authorities is known as the trust chain; if any of it is missing, you get the above exception.

The same can happen in the other direction. Assuming the above is successful, the client then returns its clientAuth certificate (keystore) to the server so the server can authenticate the client. The server (NiFi node) verifies trust in the same way using the truststore on the server side, so the complete trust chain for that client certificate must also exist on the server side.
If the complete trust chain exists there as well, the mutual TLS handshake can be successful. You can manually inspect the contents of your client- and server-side keystore and truststore files using the Java keytool command:

<path to java>/keytool -v -list -keystore <keystore or truststore>

If this response assisted with your query, please take a moment to log in and click "Accept as Solution" below this post. Thank you, Matt
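The chain-building walk described above can be sketched in a few lines. This is an illustrative Python toy only, not a TLS implementation; certificates are reduced to owner/issuer DN pairs and the truststore to a list of such entries:

```python
# Toy sketch of trust-chain building (not real TLS code): starting from the
# presented certificate, follow each issuer DN to a truststore entry's owner
# DN until reaching a self-signed root (owner == issuer). A missing link is
# the "unable to find valid certification path" situation.
def build_trust_chain(presented, truststore):
    by_owner = {cert["owner"]: cert for cert in truststore}
    chain, issuer = [presented], presented["issuer"]
    while True:
        cert = by_owner.get(issuer)
        if cert is None:
            return None              # broken chain -> PKIX path building fails
        chain.append(cert)
        if cert["owner"] == cert["issuer"]:
            return chain             # reached the self-signed root
        issuer = cert["issuer"]

truststore = [
    {"owner": "CN=intermediate-ca", "issuer": "CN=root-ca"},
    {"owner": "CN=root-ca", "issuer": "CN=root-ca"},  # self-signed root
]
server_cert = {"owner": "CN=nifi-node1", "issuer": "CN=intermediate-ca"}
print(build_trust_chain(server_cert, truststore) is not None)  # full chain found
print(build_trust_chain(server_cert, truststore[:1]))          # None: root missing
```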