Member since: 07-30-2019
Posts: 3374
Kudos Received: 1616
Solutions: 998
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 31 | 10-20-2025 06:29 AM |
|  | 171 | 10-10-2025 08:03 AM |
|  | 146 | 10-08-2025 10:52 AM |
|  | 143 | 10-08-2025 10:36 AM |
|  | 203 | 10-03-2025 06:04 AM |
07-09-2025
10:21 AM
@PradNiFi1236 There is not much here to work with. I suggest first comparing the NiFi configuration files across all your nodes to make sure they are identical, with the exception of hostnames, keystores, and truststores. Are you using a load balancer? If so, do you see a change in behavior if you enable session affinity (sticky sessions) in your LB? Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
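A minimal sketch of the comparison step, assuming you have pulled a copy of each node's nifi.properties locally. The file contents, paths, and the list of per-node property prefixes are illustrative only; extend the ignore list to match your deployment:

```python
# Sketch: compare two nodes' nifi.properties while ignoring settings
# that are expected to differ per node (hostname, keystore, truststore).
# Property prefixes and sample contents below are examples, not a
# complete list for a real cluster.
IGNORED_PREFIXES = (
    "nifi.web.https.host",
    "nifi.security.keystore",
    "nifi.security.truststore",
)

def normalized(text: str) -> set:
    """Return the set of property lines, minus comments and per-node entries."""
    return {
        line.strip()
        for line in text.splitlines()
        if line.strip()
        and not line.startswith("#")
        and not line.startswith(IGNORED_PREFIXES)
    }

# Stand-ins for the files copied from node1 and node2:
node1 = "nifi.web.https.host=node1\nnifi.cluster.is.node=true\n"
node2 = "nifi.web.https.host=node2\nnifi.cluster.is.node=true\n"

diff = normalized(node1) ^ normalized(node2)  # symmetric difference
print("configs match" if not diff else sorted(diff))
```

Any line that shows up in the symmetric difference is a property present or valued differently on one node only, which is exactly what you want to inspect first.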
07-09-2025
10:14 AM
Possibly related to https://issues.apache.org/jira/browse/NIFI-14462. I suggest reviewing the discussion in that Apache Jira and reviewing your "nifi.web.request.timeout" setting in the nifi.properties file. Adjusting this setting may help here. Thank you, Matt
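For reference, the setting lives in nifi.properties and takes a NiFi time period (the default in recent releases is "60 secs"). The value below is an illustrative example, not a recommendation for your environment:

```
# nifi.properties -- example only; tune to your environment
nifi.web.request.timeout=5 mins
```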
07-07-2025
05:54 AM
@MK77 First, let's clarify the ZooKeeper (ZK) elected roles in Apache NiFi.

Primary: ZK elects one node in the cluster as the "Primary" node. Processor components on the canvas configured with Execution=Primary node will only get scheduled on that elected primary node. No other nodes will schedule these processors to execute.

Cluster Coordinator: ZK elects one of the nodes as the cluster coordinator. Other nodes learn which node is the elected cluster coordinator from ZK. All nodes send heartbeats to the cluster coordinator to form the cluster.

Any node in the NiFi cluster can be assigned either or both of these roles. There is no guarantee that the same node(s) will always be assigned these roles; even after the NiFi cluster is formed and roles are assigned, the nodes holding these roles can change.

The flow.json.gz contains the dataflows on the canvas that are loaded on startup. The flow.xml.gz is only loaded if the flow.json.gz is missing; if NiFi loads the dataflow from the flow.xml.gz, it will generate a flow.json.gz from that flow.xml.gz.

Now on to your problem... Neither of the log lines you shared points to any problem:

"Invalid State Cannot replicate request to Node <node-hostname:port> because the node is not connected" -- this simply tells you that this node can't replicate a request to another node yet because that node has not yet connected to the cluster.

"o.a.n.w.a.c.IllegalClusterStateExceptionMapper org.apache.nifi.cluster.manager.exception.IllegalClusterStateException: The Flow Controller is initializing the Data Flow.. Returning Conflict response." -- this simply tells you that the flow.json.gz is still being initialized (loaded). This process needs to complete before the node finishes startup and can join the cluster. Depending on which Apache NiFi version you are running and the size of your dataflow, this can take some time.

What is the complete version of NiFi you are using?
Without your full logs it is not possible, from what has been shared, to tell you what is going on or even whether there really is any corruption in your flow.json.gz. One thing you can do is configure your NiFi to start up with all components on your canvas stopped instead of in their last known state. This can be helpful if you have recently added a new dataflow that is perhaps causing issues initializing at startup. This is achieved by changing the following setting in the nifi.properties file:

nifi.flowcontroller.autoResumeState=false

Save a backup of your flow.json.gz before starting after changing this setting; the saved flow.json.gz will have the original saved state (Running, Stopped, Disabled) of all the components. If your NiFi cluster starts fine after making this change, you can restart your dataflows to see if any are having issues. Beyond the above suggestion, there is not enough information shared to suggest anything else. Thank you, Matt
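The backup step can be scripted; here is a minimal sketch. The paths below are placeholders (a real NiFi keeps flow.json.gz under $NIFI_HOME/conf), and the demo file stands in for the real one:

```python
# Sketch: timestamped backup of flow.json.gz before restarting with
# nifi.flowcontroller.autoResumeState=false. Paths are placeholders.
import pathlib
import shutil
import time

conf = pathlib.Path("/tmp/nifi-conf-demo")  # stand-in for $NIFI_HOME/conf
conf.mkdir(exist_ok=True)
flow = conf / "flow.json.gz"
flow.write_bytes(b"demo")                   # placeholder for the real file

stamp = time.strftime("%Y%m%d-%H%M%S")
backup = flow.with_name(f"flow.json.gz.{stamp}.bak")
shutil.copy2(flow, backup)                  # copy2 preserves the mtime
print(backup.name)
```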
07-03-2025
06:36 AM
@HoangNguyen There isn't an existing processor included with Apache NiFi capable of performing a UNION ALL against the contents of multiple FlowFiles. The JoinEnrichment processor is the only processor that can modify the contents of one FlowFile using the contents of another, but it only handles two FlowFiles (an original FlowFile and an enrichment FlowFile) in a single execution. The other record-oriented processors all perform actions against the individual records in a FlowFile. You may need to develop your own custom processor for such a task: something like the MergeRecord processor that bins like FlowFiles and then performs a UNION ALL on those binned FlowFiles. You could also raise a Jira in Apache NiFi (https://issues.apache.org/jira/browse/NIFI) asking for a processor that can perform such an operation; maybe someone would attempt to build it if there is enough Apache community interest. You could also explore what Cloudera offers its customers in terms of professional services that could help with building custom processors for Cloudera Flow Management offerings based off Apache NiFi. Thank you, Matt
07-03-2025
06:14 AM
@Rohit1997jio https://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html Your Quartz cron "0-30 */6 * * * ?" translates to: execute every second from second 0 through 30, during every 6th minute of every hour. Note that in Quartz, */6 in the minutes field is equivalent to 0/6, so the processor gets scheduled at minutes 0/6/12/18/24/30/36/42/48/54 of every hour. If you want it to start at minute 6 instead, use 6/6, which schedules the processor at minutes 6/12/18/24/30/36/42/48/54 of every hour (you would, however, have a 12 minute gap between the end of each hour and minute 6 of the next hour with this config). Also keep in mind that scheduling does not necessarily mean execution at the same time. NiFi has a Max Timer Driven Thread pool from which threads are handed out to scheduled processors. With very large flows or processors with long-running threads, a scheduled processor may need to wait for a thread to become available before it actually executes. Thank you, Matt
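A tiny stdlib sketch of how the "/" increment in a Quartz-style minutes field resolves (per the Quartz CronTrigger docs, the start value defaults to 0 when "*" is used). This models only the increment syntax, not a full cron parser:

```python
# Sketch: minutes matched by a Quartz-style "start/step" minutes field.
# Models only the '/' increment syntax; not a full cron parser.
def matched_minutes(field: str) -> list:
    start_s, step_s = field.split("/")
    start = 0 if start_s == "*" else int(start_s)  # '*' start defaults to 0
    step = int(step_s)
    return list(range(start, 60, step))

print(matched_minutes("*/6"))  # [0, 6, 12, 18, 24, 30, 36, 42, 48, 54]
print(matched_minutes("0/6"))  # same minutes as */6
print(matched_minutes("6/6"))  # [6, 12, 18, 24, 30, 36, 42, 48, 54]
```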
07-03-2025
05:56 AM
@NifiEnjoyer Welcome to the community. As this thread is related to the deprecation of NiFi templates in Apache NiFi 2 and is an old thread, it would be better to start a new community question with your query about downloading and uploading flow definitions. You'll want to include your source and destination Apache NiFi versions in your question details. Feel free to @MattWho in your new community question. Thank you, Matt
07-02-2025
08:31 AM
@HoangNguyen All the ForkEnrichment processor does is add two specific FlowFile attributes ("enrichment.group.id" and "enrichment.role") to each FlowFile it outputs. The JoinEnrichment processor depends on receiving two FlowFiles with a matching "enrichment.group.id": one with "enrichment.role" = ORIGINAL and the other FlowFile with "enrichment.role" = ENRICHMENT. So you can do something like this, for example: fork the starting FlowFile and join the first enrichment, then use ForkEnrichment again to generate the needed FlowFile attributes for the second JoinEnrichment operation. Thank you, Matt
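To make the attribute contract concrete, here is a sketch that models the two attributes ForkEnrichment stamps on its outputs, with FlowFiles represented as plain dicts. The helper function and attribute values are illustrative; only the two attribute names come from NiFi itself:

```python
# Sketch: the two attributes ForkEnrichment adds, modeled on dicts.
# fork_enrichment() is a hypothetical helper, not a NiFi API.
import uuid

def fork_enrichment(flowfile: dict) -> tuple:
    """Return (original, enrichment) copies sharing one group id."""
    group_id = str(uuid.uuid4())
    original = {**flowfile,
                "enrichment.group.id": group_id,
                "enrichment.role": "ORIGINAL"}
    enrichment = {**flowfile,
                  "enrichment.group.id": group_id,
                  "enrichment.role": "ENRICHMENT"}
    return original, enrichment

orig, enr = fork_enrichment({"filename": "data.json"})
# JoinEnrichment matches on the shared group id and opposite roles:
print(orig["enrichment.group.id"] == enr["enrichment.group.id"])
```

Forking a second time generates a fresh group id, which is why the two-stage fork/join pattern works for chaining enrichments.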
07-02-2025
05:26 AM
@Rohit1997jio The content of a NiFi FlowFile does not live in NiFi heap memory space; only the FlowFile metadata/attributes are held in NiFi heap memory. Even then, there are per-connection thresholds at which swap files are created to reduce that heap usage. Some processors may need to load content into heap memory when they execute against FlowFile(s).

Before making recommendations on your ConsumeKafkaRecord processor configuration, more information about your NiFi and Kafka topic is needed. Are you running a multi-node NiFi cluster or a single instance of NiFi? If a cluster, how many nodes make up your NiFi cluster? How many partitions are set up on the target Kafka topic?

Kafka assigns partitions to the different consumers in a consumer group. So let's say you have 10 partitions on your Kafka topic, 1 NiFi instance, and a ConsumeKafkaRecord configured with 1 concurrent task: all 10 of those partitions would be assigned to that one consumer. When the ConsumeKafkaRecord executes, it will consume from one of those partitions, the next execution from the next partition, and so on. This is likely why you are not seeing all the Kafka messages consumed when you schedule the processor to execute only once every 4 hours. Even if you were to set concurrent tasks to 10 on the ConsumeKafkaRecord processor, the scheduler is only going to allow one execution every 4 hours. So in this case you would be best suited to set 10 concurrent tasks and adjust your Quartz cron scheduler so it schedules every second for 10 seconds every 4 hours. Also keep in mind the "Max Poll Records" setting, as it controls the maximum number of records (messages) added to the single record FlowFile created during each execution. If you have a lot of records, you might consider widening that scheduling window every 4 hours to maybe 30 seconds to make sure you get all messages from every partition.
Now assuming you have a multi-node NiFi cluster with, for example, 5 nodes, your ConsumeKafkaRecord processor configured with a group.id, and 10 partitions: you would set concurrent tasks to 2 (2 consumers x 5 nodes = 10 consumers in the consumer group). Kafka will assign one partition to each of these 10 consumers in the consumer group. Hope this helps you configure your ConsumeKafkaRecord processor so you can be successful with your requirement. Thank you, Matt
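The sizing arithmetic above can be sketched as a one-liner; the helper name and the example numbers are mine, not NiFi's:

```python
# Sketch: ConsumeKafkaRecord concurrent tasks per node so that
# nodes * tasks covers every partition (one consumer per partition).
import math

def tasks_per_node(partitions: int, nifi_nodes: int) -> int:
    """Smallest per-node task count with nodes * tasks >= partitions."""
    return math.ceil(partitions / nifi_nodes)

print(tasks_per_node(10, 5))  # 2 -> 2 tasks x 5 nodes = 10 consumers
print(tasks_per_node(10, 1))  # 10 on a standalone NiFi instance
```

Going above one consumer per partition buys nothing: Kafka leaves the extra consumers in the group idle.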
07-01-2025
05:50 AM
@Bhar Can you share more detail? Without it, I would only be making random guesses. What version of Apache NiFi are you using? Is this a single instance of NiFi or a multi-node NiFi cluster? How is your MergeContent processor configured? Thank you, Matt
07-01-2025
05:44 AM
@HoangNguyen Welcome to the community. It would be very difficult to provide any suggestions with the limited information you have shared; please share more detail about your use case and what you are trying to accomplish. The JoinEnrichment processor is used in conjunction with the ForkEnrichment processor. For a JoinEnrichment processor to join two NiFi FlowFiles, those two FlowFiles must both have a matching group id set in an "enrichment.group.id" attribute, and each must also have an "enrichment.role" attribute set appropriately (ORIGINAL on the FlowFile to be enriched and ENRICHMENT on the FlowFile containing the enrichment data). Thank you, Matt