Member since
07-30-2019
3470
Posts
1641
Kudos Received
1018
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 251 | 05-06-2026 09:16 AM |
| | 439 | 05-04-2026 05:20 AM |
| | 317 | 05-01-2026 10:15 AM |
| | 509 | 03-23-2026 05:44 AM |
| | 385 | 02-18-2026 09:59 AM |
01-31-2024
08:28 AM
1 Kudo
@plapla The ConsumeKafka processor should not be reading the same message twice. The processor should be maintaining local state (since you are not clustered) in NiFi's local state directory. Make sure that you are not having disk space or permissions issues that may prevent the writing of that local state. You can right-click on the ConsumeKafka processor to view the currently stored state.

The ConsumeKafka processor creates a consumer group using the Group ID configured in the processor, so make sure you do not have multiple ConsumeKafka processors consuming from the same Kafka topic using the same Group ID. For optimal performance, the number of concurrent tasks configured on the ConsumeKafka processor should match the number of partitions on the target topic.

Do you see any Kafka rebalance going on? That will happen when you have more consumers than partitions in the consumer group that is consuming from that topic. A rebalance can affect the commit of the offset, resulting in possible data duplication.

Thanks, Matt
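As a rough illustration of the concurrent tasks versus partitions point, here is a minimal Python sketch. It assumes a plain round-robin assignment, which is a simplification and not Kafka's actual assignor logic; the task names and the function names are hypothetical.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions across consumers round-robin (a simplified model, not Kafka's assignor)."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def idle_consumers(assignment):
    """Return the consumers that ended up with no partition to read."""
    return [consumer for consumer, parts in assignment.items() if not parts]


# Four concurrent tasks against a three-partition topic: one task has nothing to read.
tasks = ["task-0", "task-1", "task-2", "task-3"]
assignment = assign_partitions(3, tasks)
print(assignment)
print("idle:", idle_consumers(assignment))
```

With three partitions and three tasks, every task in this model owns exactly one partition, which is the configuration the post recommends.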
01-30-2024
06:02 AM
@FrankHaha I am a little confused by your ask due to the terminology used. A NiFi template (deprecated and removed in NiFi 2.x) is a reusable NiFi dataflow snippet (a collection of interconnected components and controller services in XML format). Templates have been replaced by "flow definitions" (similar to templates, but in JSON format). You can't execute a template or a flow definition. You can deploy a template or flow definition to the canvas of an installed and running NiFi instance.

Anything you can do via the NiFi UI, you can also accomplish via NiFi rest-api calls. The easiest way to learn which rest-api calls are needed, and the format of each of those calls, is through your browser's built-in developer tools. You can perform each action via the UI and use "copy as cURL" in the developer tools' "Network" tab to capture the rest-api call that was made. This includes importing a flow definition or template, modifying the imported components, enabling, starting, stopping, etc. You can put those calls into a script to perform the same commands later without using the UI.

Another option might be the NiFi CLI toolkit, which offers a variety of commands for doing similar functions as the rest-api calls.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
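To give a feel for what a scripted call might look like, here is a hedged Python sketch that builds (but does not send) a request to start the components in a process group. The host name, process group id, and function name are hypothetical, and the endpoint path and JSON body are assumptions that should be confirmed against what your browser's developer tools capture for your NiFi version.

```python
import json
import urllib.request


def build_schedule_request(base_url, process_group_id, state, token=None):
    """Build (but do not send) a PUT that schedules the components in a process group.

    Assumption: the path and body follow the /nifi-api/flow/process-groups/{id} call;
    verify both against a call captured from your own NiFi UI.
    """
    url = f"{base_url.rstrip('/')}/nifi-api/flow/process-groups/{process_group_id}"
    body = json.dumps({"id": process_group_id, "state": state}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Secured instances typically expect a bearer token obtained at login.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Hypothetical host and process group id, for illustration only.
request = build_schedule_request("https://nifi.example.com:8443", "hypothetical-pg-id", "RUNNING")
print(request.get_method(), request.full_url)
# Sending is left to the caller, for example: urllib.request.urlopen(request)
```

Keeping the request construction separate from the send makes it easy to print and compare each call against the captured one before running it against a live instance.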
01-29-2024
05:58 AM
@manishg Are you seeing the same number of FlowFiles processed per second after switching to the volatile repositories? Perhaps having the FlowFile and provenance repositories in memory allows for faster processing of FlowFiles, resulting in more reads and writes to the content_repository, which contains the actual content of each FlowFile. If your NiFi should crash or restart, you will lose everything in your volatile repositories. The FlowFile repository holds all the FlowFile metadata for the FlowFiles currently being processed through your dataflows, so this means data loss in such events. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
01-29-2024
05:44 AM
@PriyankaMondal Just to add to what @ckumar provided, the NiFi repositories are not locked to a specific node. What I mean by that is that they can be moved to a new node, with "new" being the key word there.

A typical prod NiFi setup will use protected storage for its flowfile_repository and content_repository(s), which hold all the FlowFile metadata and FlowFile content for all actively queued and archived FlowFiles on a node. To prevent loss of data, these repositories should be protected through the use of RAID storage or some other equivalent protected storage. The data stored in these repositories is tightly coupled to the flow.xml.gz/flow.json.gz that the cluster is running on every node.

Let's say you have a hardware failure. It may be faster to stand up a new server than to repair the existing hardware. You can simply move or copy the protected repositories to the new node before starting it. When the node starts and joins your existing cluster, it will inherit the cluster's flow.xml.gz/flow.json.gz and then begin loading the FlowFiles from those moved repositories into the connection queues. Processing will continue exactly where it left off on the old node.

There is no way to merge repositories together, so you cannot add the contents of one node's repositories to the already existing repositories of another node.

The provenance_repository holds lineage data, and the database_repository holds flow configuration history and some node-specific info. Neither of these is needed to preserve the actual FlowFiles.

Hope this helps, Matt
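The copy step described above could be scripted along these lines. This is only a sketch: it assumes both repositories sit directly under a NiFi home directory with their default names (the real locations are whatever your nifi.properties points at), that NiFi is stopped on both nodes, and that the old node's storage is reachable from the new one. The function name and paths are hypothetical.

```python
import shutil
from pathlib import Path

# Assumed default directory names; check the repository paths in your nifi.properties.
REPOSITORIES = ("flowfile_repository", "content_repository")


def copy_repositories(old_nifi_home, new_nifi_home, names=REPOSITORIES):
    """Copy each named repository directory from the old node's NiFi home to the new one."""
    copied = []
    for name in names:
        source = Path(old_nifi_home) / name
        target = Path(new_nifi_home) / name
        if target.exists():
            # Repositories cannot be merged, so refuse to write into an existing one.
            raise FileExistsError(f"{target} already exists; repositories cannot be merged")
        shutil.copytree(source, target)
        copied.append(target)
    return copied


# Hypothetical usage, run before the new node is started for the first time:
# copy_repositories("/mnt/old-node/nifi", "/opt/nifi")
```

Refusing to copy into an existing directory mirrors the point in the post that one node's repositories cannot be added to another node's existing repositories.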
01-26-2024
06:14 AM
@plapla Since Apache NiFi does not have a ConsumeKafka processor built with the Kafka 2.6 client, I would recommend going with the client closest to, but not newer than, the Kafka server version you are using. In this case, that is the ConsumeKafka_2_0 processor.

1. Is your NiFi a standalone NiFi instance or a multi-node NiFi cluster?
2. How are your DistributedMapCacheClient and DistributedMapCacheServer controller services configured?
3. What is the rate of FlowFiles being produced by your ConsumeKafka processor?

I see that you configured the DetectDuplicate processor with an age off of 420 days; however, the DistributedMapCacheServer has a configured maximum number of cache entries, with a default of 10,000. So it is possible that, due to volume, cache entries are being removed from the cache server, resulting in issues detecting duplicates.

The DistributedMapCacheServer also holds all cache entries in NiFi heap memory and is not a good cache server to use for high-volume caches (because of heap usage). It also offers no high availability. This becomes even more of an issue with a NiFi cluster. You would be better off using an external map cache server.

If you are using a NiFi cluster, make sure your DistributedMapCacheClient is configured to connect to one specific DistributedMapCacheServer. I have seen misuse here where individuals configured it to connect to localhost, or configured each NiFi node to connect to its own host's map cache server. The map cache servers running on each node do not share data.

Hope this helps you... If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
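To show why a capped cache can miss duplicates regardless of the age-off setting, here is a small Python model. It is not how the DistributedMapCacheServer is implemented: the class is hypothetical and it assumes oldest-first eviction for simplicity, whereas the real server's eviction strategy is configurable.

```python
from collections import OrderedDict


class BoundedCache:
    """Keeps at most max_entries keys, evicting the oldest first (a simplifying assumption)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._seen = OrderedDict()

    def is_duplicate(self, key):
        """Return True if key is still cached; otherwise remember it and return False."""
        if key in self._seen:
            return True
        self._seen[key] = True
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)  # drop the oldest entry to stay under the cap
        return False


cache = BoundedCache(max_entries=10_000)
for i in range(10_001):
    cache.is_duplicate(f"msg-{i}")

print(cache.is_duplicate("msg-10000"))  # a recent key is still cached
print(cache.is_duplicate("msg-0"))      # the oldest key was evicted, so its repeat goes undetected
```

Once more than 10,000 distinct keys have passed through, the earliest ones are gone from this model, so a repeat of an early message looks new.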
01-24-2024
06:56 AM
1 Kudo
@sukanta I am not completely clear on what you are asking for here. Can you provide more detail? What do you mean by "custom error page in NiFi"? Are you referring to the NiFi Bulletin Board? Only users who are authorized for the component producing the bulletin message would be able to view the bulletin text via the component or the Bulletin Board. Thanks, Matt
01-23-2024
08:33 AM
@plapla A couple of things here:

1. You should be using the ConsumeKafka processor that matches your Kafka server version. If your Kafka server is 2.6, you should be using the ConsumeKafka_2_6 processor.
2. In your DetectDuplicate processor you are using ${key}. How is this attribute being created on each FlowFile, and where is its value derived from?

Thanks, Matt
01-19-2024
12:20 PM
3 Kudos
Big shout out to all the amazing contributors to this community both via question solution assistance and valuable articles!!! Together we enable great people to accomplish great things.
01-19-2024
10:15 AM
2 Kudos
@Sartha You are still not using the correct URL in your PostHTTP, as I stated in my last response. Based on what you shared in your last response, it needs to be: http://10.73.121.84:5026/contentListener The ListenHTTP processor defines the "Base Path" on which it is listening for inbound connections. The default value is "contentListener", so the PostHTTP needs to post to that Base Path. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
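A tiny Python sketch of how the target URL is put together from the listener's host, port, and Base Path may help when checking the PostHTTP configuration. The helper name is hypothetical; the default Base Path is the one stated in the post.

```python
def listen_http_url(host, port, base_path="contentListener", scheme="http"):
    """Join the ListenHTTP host, port, and Base Path into the URL PostHTTP should target."""
    return f"{scheme}://{host}:{port}/{base_path.strip('/')}"


# Host and port are the ones quoted in the post.
url = listen_http_url("10.73.121.84", 5026)
print(url)
```

If the Base Path on the ListenHTTP processor has been changed from its default, pass that value as `base_path` so both sides agree.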
01-19-2024
10:05 AM
3 Kudos
@Dave0x1 That is a big jump in versions, from 1.13 directly to 1.24. Use the NiFi Toolkit instead to change the algorithm. Download NiFi Toolkit 1.24.0 from https://nifi.apache.org/download/ and run:

./encrypt-config.sh -n <nifi.properties from original 1.13 NiFi> -f <flow.xml.gz from original 1.13 NiFi> -x -s <sensitive props key from NiFi> -b <bootstrap.conf from original 1.13 NiFi> -A NIFI_PBKDF2_AES_GCM_256 -g <new 1.24 flow.xml.gz filename>

Then, in your NiFi 1.24, remove or rename the current flow.xml.gz and flow.json.gz files. Place the flow.xml.gz output from the above toolkit command into the same location and make sure permissions and ownership are correct. Start your NiFi 1.24. Since the flow.json.gz does not exist, NiFi will load the flow.xml.gz and, upon successful startup, generate the new flow.json.gz file, which it will load from that point forward each time NiFi is restarted.

Hope this works for you. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
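If you would rather assemble the toolkit command in a script than type it by hand, a sketch like the following could work. It only rebuilds the flags shown in the post; the function name and every file path in the example are hypothetical placeholders, and nothing is executed.

```python
import shlex


def build_encrypt_config_command(nifi_properties, flow_xml, sensitive_props_key,
                                 bootstrap_conf, output_flow_xml,
                                 algorithm="NIFI_PBKDF2_AES_GCM_256"):
    """Assemble the encrypt-config.sh argument list using only the flags quoted in the post."""
    return [
        "./encrypt-config.sh",
        "-n", nifi_properties,
        "-f", flow_xml,
        "-x",
        "-s", sensitive_props_key,
        "-b", bootstrap_conf,
        "-A", algorithm,
        "-g", output_flow_xml,
    ]


# Placeholder values for illustration; substitute the files from the original 1.13 install.
command = build_encrypt_config_command(
    "old/conf/nifi.properties",
    "old/conf/flow.xml.gz",
    "example-sensitive-key",
    "old/conf/bootstrap.conf",
    "new/flow.xml.gz",
)
print(shlex.join(command))
# To run it from the toolkit's bin directory: subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems if the sensitive properties key contains special characters.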