Member since
07-30-2019
3406
Posts
1622
Kudos Received
1008
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 134 | 12-17-2025 05:55 AM |
| | 195 | 12-15-2025 01:29 PM |
| | 133 | 12-15-2025 06:50 AM |
| | 260 | 12-05-2025 08:25 AM |
| | 418 | 12-03-2025 10:21 AM |
04-10-2019
06:55 PM
@Samar Aarkotti

*** Community Forum Tip: Try to avoid starting a new answer in response to an existing answer. Instead, use comments to respond to existing answers. There is no guaranteed order to different answers, which can make a discussion hard to follow.

It is always best to leave your processor at the default value for concurrent tasks unless there is a specific need to increase it. Here is an article on this topic: https://community.hortonworks.com/articles/221808/understanding-nifi-max-thread-pools-and-processor.html and another on "Run Duration": https://community.hortonworks.com/articles/221807/understanding-nifi-processors-run-duration-functio.html
04-10-2019
06:32 PM
@Samar Aarkotti

The exception you are seeing can be expected because of the concurrent execution you have going on per node. With 2 concurrent tasks, the processor can execute its code twice in parallel, resulting in one thread updating what is in state before the other thread does.

In the event that the UpdateAttribute processor is unable to get the state at the beginning of the onTrigger, the FlowFile will be pushed back to the originating relationship and the processor will yield. If the processor is able to get the state at the beginning of the onTrigger but unable to set the state after adding attributes to the FlowFile, the FlowFile will be transferred to "set state fail". This is normally due to the state not being the most up-to-date version (another thread has replaced the state with another version). In most use cases this relationship should loop back to the processor, since the only affected attributes will be overwritten.

When using state in the UpdateAttribute processor, I would suggest that you configure the processor with only 1 concurrent task. Keep in mind that the processor settings are per node, so each node in your cluster will still be executing this processor.

If throughput is not meeting your needs, make sure you have properly load-balanced the source FlowFiles across all nodes in your cluster. If you have and throughput is still an issue, try adjusting the "Run Duration" in very small increments while still leaving concurrent tasks at 1.

Thank you,
Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
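To make the "set state fail" case more concrete, here is a minimal Python sketch of a versioned state store that rejects stale writes. It is only an illustration of the race described above, not NiFi's actual implementation; the `VersionedState` class, the `increment_counter` helper, and the `counter` key are all made up for the example.

```python
class VersionedState:
    """Toy stand-in for a state store that rejects stale writes (hypothetical)."""

    def __init__(self):
        self.version = 0
        self.values = {}

    def get(self):
        # Return a snapshot together with the version it was read at.
        return self.version, dict(self.values)

    def replace(self, expected_version, new_values):
        # Succeed only if nobody else has written since the snapshot was read.
        if expected_version != self.version:
            return False
        self.values = dict(new_values)
        self.version += 1
        return True


def increment_counter(state, snapshot):
    """Bump a counter based on a previously read snapshot."""
    version, values = snapshot
    values["counter"] = values.get("counter", 0) + 1
    return state.replace(version, values)


state = VersionedState()

# Two concurrent tasks both read the state before either one writes.
snapshot_a = state.get()
snapshot_b = state.get()

first_ok = increment_counter(state, snapshot_a)   # wins the race
second_ok = increment_counter(state, snapshot_b)  # stale version, write rejected

# Looping the failure back amounts to reading the state again and retrying.
retry_ok = increment_counter(state, state.get())
```

With a single concurrent task the two reads can never interleave this way, which is why 1 concurrent task avoids the exception.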
04-10-2019
06:13 PM
1 Kudo
@Kevin Lahey

1. Each NiFi node in a cluster runs its own copy of the flow.xml and processes its own set of FlowFiles. Nodes are unaware of what FlowFiles exist on other nodes in the cluster.

2. In much older versions of NiFi (Apache 0.x versions), NiFi did not have any high availability (HA) at the control level within a cluster. There was a dedicated NiFi instance known as the NiFi Cluster Manager (NCM). This was the only instance in the NiFi cluster that could be accessed, and all the nodes connected to this NCM. If the NCM went down, the entire NiFi cluster was not reachable. As of Apache NiFi 1.x+ the NCM no longer exists, and the cluster relies on ZooKeeper to elect cluster nodes to the roles of Cluster Coordinator and Primary Node. If the currently elected node(s) for these roles go down, a new node is elected to these roles. In this way HA at the control level is provided.

   When you create any component (processor, controller service, reporting task, etc.), that component is replicated to all nodes in the cluster. So yes, the DistributedMapCacheServer controller service would be running on all nodes. If you then configured the DistributedMapCacheClient to use "localhost", each node would be reading from and writing to a different cache server. The DistributedMapCacheClient should be configured to point at a specific node rather than localhost. As you can see, you have no HA in this type of setup, since you depend on the one node hosting the cache server you are using to always be up. Instead you should be using one of the external cache options like HBase in order to have HA.

3. As explained above, there is no such thing as an NCM as of Apache NiFi 1.x+.

4. Every component you add to the NiFi canvas runs within a single JVM on each NiFi node, so you cannot configure multiple components that bind to the same port. The first component will bind to the port, and when the other components are started they will throw an exception about the port already being in use. You can have as many clients (DistributedMapCacheClient) as you like, since they act as clients and do not bind to a port. Only the server binds to the port so it can listen for client requests.

Hope this helps
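The port behavior in point 4 is ordinary socket behavior and can be reproduced outside NiFi. The following Python sketch is only an analogy (NiFi itself runs on the JVM); the `try_bind` helper is made up for the example, and port 0 simply asks the operating system for any free port.

```python
import socket


def try_bind(host, port):
    """Return a listening socket, or None if the address is already in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError:
        # Typically "Address already in use" when another server holds the port.
        sock.close()
        return None


# The first "server" binds successfully (port 0 = let the OS pick a free port).
first_server = try_bind("127.0.0.1", 0)
port = first_server.getsockname()[1]

# A second "server" on the same port fails, like a second cache server would.
second_server = try_bind("127.0.0.1", port)
second_bound = second_server is not None

# Clients do not bind to the server port, so any number of them can connect.
clients = [socket.create_connection(("127.0.0.1", port)) for _ in range(2)]
clients_connected = len(clients)

for client in clients:
    client.close()
first_server.close()
```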
04-10-2019
05:45 PM
Or are you looking for a unique ID that is assigned to every FlowFile that traverses a specific set of processors (a dataflow) on the NiFi canvas, so that you can track which FlowFiles were processed by which dataflow? If so, rather than using the UUID() NiFi Expression Language (EL) function, why not just have an UpdateAttribute processor in each one of these unique dataflows set a static value on each FlowFile? For example: sessionID = dataflow1
04-09-2019
02:22 PM
@Abhinav Joshi

*** Community Forum Tip: Try to avoid starting a new answer in response to an existing answer. Instead, use comments to respond to existing answers. There is no guaranteed order to different answers, which can make a discussion hard to follow.

I would suggest searching the nifi/work directory for multiple versions of the update-attribute nar bundle. You may have multiple nars of different versions installed. The flow.xml.gz file does contain the specific processor version for each component. When starting NiFi 1.9 using the flow.xml.gz from another NiFi version, the component versions will automatically be updated to the new version only if a single option exists. If you have an updateAttribute-1.8.<custom> and an updateAttribute-1.9.0 version available and the flow.xml.gz has an updateAttribute-1.8.0, then it will not auto-update, because there are two options and it does not know which should be used.

My guess here is that your NiFi 1.8.0 contained both the standard 1.8.0 version of the UpdateAttribute processor and a custom version of the UpdateAttribute processor, and that your flow contained UpdateAttribute components of each. Then you upgraded to NiFi 1.9.0, which replaced the stock UpdateAttribute with 1.9.0, and the custom version of the UpdateAttribute processor was also carried over to your NiFi 1.9.0 install.

Thanks,
Matt
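The "only if a single option exists" rule can be pictured with a small Python sketch. This only models the behavior described above and is not NiFi's actual resolution code; the `resolve_version` function is hypothetical, and the version strings are the ones from the example.

```python
def resolve_version(requested, available):
    """Pick the bundle version to load for a component, or None for a ghost.

    requested: version recorded in flow.xml.gz
    available: versions of the same processor class found in this install
    """
    if requested in available:
        return requested      # exact match, nothing to decide
    if len(available) == 1:
        return available[0]   # single option: auto-update to it
    return None               # zero or several options: cannot choose


# Clean upgrade: only the new stock version is installed.
clean_upgrade = resolve_version("1.8.0", ["1.9.0"])

# Stock 1.9.0 plus a carried-over custom build: two options, no auto-update.
ambiguous = resolve_version("1.8.0", ["1.8.<custom>", "1.9.0"])

# Components that referenced the custom build still find their exact version.
custom_match = resolve_version("1.8.<custom>", ["1.8.<custom>", "1.9.0"])
```

In the ambiguous case the component stays unresolved until a version is chosen for it, which matches what you are seeing after the upgrade.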
04-09-2019
01:48 PM
@Kevin Lahey

I completely agree with @Shu. It sounds like you have the ListS3 processor executing on all 4 nodes in a NiFi cluster. This results in each NiFi node listing the same filename, which means that each node then tries to look up that filename in the distributed cache used by the DetectDuplicate processor. This creates a race condition between your nodes, where one or more nodes fail to find an entry in the cache before one of the nodes adds the new filename to that cache.

Your flow should be running the ListS3 processor (scheduled to run on the primary node only) with its success relationship feeding a FetchS3Object processor. The connection between those two processors should be configured to load balance the listed files across all nodes in the cluster.

Thanks,
Matt
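A quick way to see the difference between the two setups is to count how many times each filename ends up being handled. The Python sketch below is a simplified model, not NiFi code; the node names, filenames, and both helper functions are made up, and round robin is assumed as the load-balance strategy.

```python
from itertools import cycle


def list_on_every_node(nodes, filenames):
    """Every node runs the listing, so every node sees every filename."""
    return {node: list(filenames) for node in nodes}


def list_once_then_balance(nodes, filenames):
    """One node lists, then the listing is spread round-robin across nodes."""
    assignment = {node: [] for node in nodes}
    for node, name in zip(cycle(nodes), filenames):
        assignment[node].append(name)
    return assignment


nodes = ["node1", "node2", "node3", "node4"]
filenames = ["a.csv", "b.csv", "c.csv"]

duplicated = list_on_every_node(nodes, filenames)
balanced = list_once_then_balance(nodes, filenames)

# Total number of FlowFiles created across the cluster in each setup.
duplicated_total = sum(len(files) for files in duplicated.values())
balanced_total = sum(len(files) for files in balanced.values())
```

In the first setup every filename is looked up in the cache once per node, which is where the race comes from. In the second, each filename exists exactly once in the cluster.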
04-08-2019
08:15 PM
@Abhinav Joshi

A processor will appear with a dashed line around the perimeter of the processor box for a couple of reasons:
1. It is a ghost implementation: when loading the flow.xml.gz, a processor was encountered that uses a custom nar which does not exist in the current NiFi installation.
2. The user logged into the canvas does not have the required permissions to view the component.

Based on your description, it does not sound like scenario 2. The question is why NiFi 1.9 did not find a processor that used the same processor class. Since you upgraded from NiFi 1.8, the UpdateAttribute processor should have referenced the org.apache.nifi.processors.attributes.UpdateAttribute class for version 1.8.x. During an upgrade of NiFi the identical class would have been found, only with a newer version. In this case NiFi would have automatically switched to using the new version.

So this raises two questions:
1. Did you have a customized version of the UpdateAttribute processor running in NiFi 1.8?
2. Do you have multiple copies of the same processor class but with different versions loaded in your NiFi 1.9?

I would suggest inspecting the nifi-app.log from the time of startup immediately following the upgrade. If the UpdateAttribute processors were replaced by "ghost" processors, you would see that logged in the nifi-app.log. This log should help to show why NiFi chose to load a ghost processor.

Thank you,
Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
04-08-2019
06:46 PM
@rohit chavan

Absolutely. Simply configure your ListenHTTP processor with an SSLContextService NiFi controller service. The controller service will be used to provide the keystore and truststore necessary to facilitate the two-way TLS/SSL handshake with the connecting client(s). When configured with an SSL Context Service, the processor's Jetty server will only accept TLS 1.2 connections.

Thank you,
Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
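On the client side, a two-way TLS connection needs a TLS context that trusts the server's certificate and presents the client's own certificate. Here is a minimal Python sketch, assuming a Python client and PEM-format files; the `build_client_context` helper and its file parameters are hypothetical and not something NiFi provides.

```python
import ssl


def build_client_context(ca_file=None, cert_file=None, key_file=None):
    """Build a client-side TLS context for a two-way TLS endpoint.

    ca_file:   PEM file with the CA that signed the server certificate
               (None falls back to the system trust store)
    cert_file: PEM file with the client certificate (hypothetical path)
    key_file:  PEM file with the client private key (hypothetical path)
    """
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Do not negotiate anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file:
        # Present a client certificate so the server can authenticate us.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


# Without certificate files this still yields a verifying TLS 1.2+ context.
context = build_client_context()
```

The resulting context can be passed to a standard-library client, for example `http.client.HTTPSConnection(host, port, context=context)`, where host and port are whatever your ListenHTTP processor is configured to listen on.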
04-08-2019
01:37 PM
@Abhinav Joshi What was the reason given for why the UpdateAttribute processors were now invalid?
04-01-2019
04:10 PM
@Micah Pearce

There are numerous ways to accomplish the same result in NiFi. The ${retry.counter:replaceNull('0'):plus(1)} NiFi Expression Language statement creates an attribute with value 1 if the attribute "retry.counter" does not exist on the FlowFile; otherwise, the value currently assigned to the existing "retry.counter" attribute is incremented by 1 and returned.

If we just set it to 1, it would be reset to 1 each time the same FlowFile passes through this loop and would never increment.

Thank you,
Matt
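If it helps to see the same logic outside of Expression Language, here is a rough Python equivalent. It is only a sketch of what the expression does: the `next_retry_counter` function is made up, and a plain dict stands in for the FlowFile attributes (which are strings).

```python
def next_retry_counter(attributes):
    """Mimic ${retry.counter:replaceNull('0'):plus(1)} on a dict of attributes."""
    current = attributes.get("retry.counter")
    if current is None:
        current = "0"  # replaceNull('0'): treat a missing attribute as zero
    return str(int(current) + 1)  # plus(1)


flowfile_attributes = {}

# Each pass through the retry loop stores the incremented value back.
seen = []
for _ in range(3):
    flowfile_attributes["retry.counter"] = next_retry_counter(flowfile_attributes)
    seen.append(flowfile_attributes["retry.counter"])
```

Setting the attribute to a constant 1 instead would leave the value unchanged on every pass, so a later check against a retry limit would never trigger.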