Member since
10-20-2022
6
Posts
0
Kudos Received
0
Solutions
06-13-2023
02:59 PM
1 Kudo
Tracking here: https://issues.apache.org/jira/browse/NIFI-11682
10-24-2022
07:55 AM
@PriyankaMondal I don't recommend using the NiFi embedded ZooKeeper (ZK). It makes things easy, but it is not an ideal solution for production. ZK requires a quorum of 3 nodes minimum. With NiFi configured to use the embedded ZK, this would require your NiFi cluster to have at least 3 nodes. Without a quorum, ZK cannot perform its required role: ZK is used to elect the cluster coordinator and primary node roles that a NiFi cluster requires.

Also, when using the embedded ZK, even with 3 NiFi nodes, ZK won't achieve quorum until all three nodes are up, and you'll see messages like the ones you shared until the ZK cluster has formed and quorum is established. Your cluster can also break (lose access to the UI) if you lose nodes (NiFi shuts down or dies), because you also lose the embedded ZK on those nodes and thus quorum is lost.

I suggest going to each of your 3 NiFi servers, Svxxx.xyz.com (1), Svxxx.xyz.com (2) and Svxxx.xyz.com (3), to make sure that ZK started and is listening on port 2181. I am assuming these are really three different hosts with unique hostnames, and not that you tried to create 3 ZK instances on one host.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
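For the "is ZK listening on port 2181" check, a quick script can save logging in to each server by hand. The following is only a minimal sketch: it tests whether a TCP connection to the port succeeds, which shows that something is listening there but not that ZK has joined a quorum. The function names are made up for illustration, and the hostnames in the usage comment are placeholders to be replaced with your real, distinct hostnames.

```python
import socket


def zk_port_open(host: str, port: int = 2181, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_hosts(hosts, port: int = 2181, timeout: float = 3.0) -> dict:
    """Map each hostname to whether its ZK client port accepts connections."""
    return {host: zk_port_open(host, port, timeout) for host in hosts}


# Example usage (placeholder hostnames, replace with your three NiFi nodes):
# for host, is_open in check_hosts(["nifi-node1", "nifi-node2", "nifi-node3"]).items():
#     print(host, "listening" if is_open else "NOT listening")
```

If any node reports that the port is not listening, the nifi-app.log on that node is the next place to look.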
10-21-2022
01:25 PM
@DGaboleiro I am a bit confused by your dataflow design. In a NiFi multi-node cluster, each node is only aware of, and can only execute upon, the FlowFiles present on that one node. So in your dataflow you have the QueryCassandra processor executing on "primary node" only, as you should (having it execute on all nodes would result in both your nodes performing the same query and returning the same data). You then split that JSON and use a DistributeLoad processor as what appears to be a means of sending half of the FlowFiles to node 1 and the other half to node 2. This is not the best way to do this. You are running Apache NiFi 1.17, which means that load balanced connections are available and can accomplish the same thing without all these additional processors. https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#settings

After your FlowFiles (this is what is being moved from processor to processor on your canvas) have been distributed, I see that you are using a MergeContent processor. The MergeContent processor can only merge the FlowFiles present on the same node. It will not merge FlowFiles from multiple nodes into a single FlowFile. So if your desire is to have one merge of all FlowFiles, distributing them across multiple nodes will not give you that outcome.

You should never configure any processor that accepts an inbound connection for "primary node" only execution. This is important because which node is elected as primary node can change at any time, and FlowFiles already queued on a node that is no longer primary would then sit unprocessed. Execution strategy has nothing to do with the availability of FlowFiles on each node on which to execute.

What is important to understand is that each node in your NiFi cluster has its own copy of the flow and its own Content and FlowFile repositories containing unique data, and each node executes the processors in its flow with no regard for the existence of other nodes. A node simply learns from ZooKeeper whether it has been elected as the cluster coordinator and/or primary node. If it is elected primary node, it will execute "primary node" and "all nodes" components. If it is not the primary node, it will only execute the "all nodes" components.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
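To make the per-node behaviour concrete, here is a toy model in Python. It is not NiFi code and does not use any NiFi API; the function names, the "primary"/"all" labels and the node names are all invented for illustration. It only mirrors the two rules described above: a node runs "primary node" components only while it is the elected primary, and a merge only ever sees the FlowFiles sitting on the same node.

```python
from collections import defaultdict


def runnable_components(components: dict, is_primary: bool) -> list:
    """components maps a component name to its execution setting: "primary" or "all".

    A non-primary node runs only the "all" components; the primary node runs both kinds.
    """
    return [
        name
        for name, execution in components.items()
        if execution == "all" or is_primary
    ]


def merge_per_node(flowfiles: list) -> dict:
    """flowfiles is a list of (node, content) pairs.

    Content is concatenated per node, so two nodes always yield two merged results.
    """
    grouped = defaultdict(list)
    for node, content in flowfiles:
        grouped[node].append(content)
    return {node: "".join(parts) for node, parts in grouped.items()}


# Example usage (hypothetical flow and node names):
# flow = {"QueryCassandra": "primary", "MergeContent": "all"}
# runnable_components(flow, is_primary=False)   # only MergeContent runs here
# merge_per_node([("node1", "a"), ("node2", "b"), ("node1", "c")])
```

In this model, distributing the split FlowFiles over two nodes always produces one merged result per node, which is why a single combined merge is not possible after load balancing.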
10-20-2022
08:56 AM
Hey,

Generally JOLT tries to maintain the input order. The only way I can think of is a hard-coded approach. Add this after the default spec:

{
  "operation": "shift",
  "spec": {
    "Order": {
      "*": {
        "top-level": {
          "ProcessType": "[&2].ProcessType",
          "FreightTerm": "[&2].FreightTerm",
          "OrderNumber2": "[&2].OrderNumber2",
          "TransmissionCommand": "[&2].TransmissionCommand",
          "SequenceNumber": "[&2].SequenceNumber",
          "SenderAbbreviation": "[&2].SenderAbbreviation"
        },
        "CustomerAddress": "[&1].CustomerAddress",
        "ShipmentAddress": "[&1].ShipmentAddress",
        "Volume": "[&1].Volume",
        "Weight": "[&1].Weight"
      }
    }
  }
}

Also, are you trying to "duplicate" the value of "OrderNumber2"? The other option I can think of is to replace JOLT with code of your own, for example using FasterXML/Jackson.
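To show what a hard-coded ordering looks like outside JOLT, here is a small Python sketch of the same idea (Python rather than Jackson, just to keep it dependency-free). It assumes the input has an "Order" array whose elements each contain a "top-level" object plus the address, volume and weight fields; that shape is inferred from the spec in this post, not confirmed from your data. The function and constant names are made up for illustration. Like the shift, it skips any field that is missing from the input.

```python
import json

# Desired output order, hard-coded just like in the shift spec.
TOP_LEVEL_FIELDS = [
    "ProcessType",
    "FreightTerm",
    "OrderNumber2",
    "TransmissionCommand",
    "SequenceNumber",
    "SenderAbbreviation",
]
ORDER_FIELDS = ["CustomerAddress", "ShipmentAddress", "Volume", "Weight"]


def reorder_orders(document: dict) -> list:
    """Flatten each order's "top-level" object and emit fields in a fixed order."""
    result = []
    for order in document.get("Order", []):
        top_level = order.get("top-level", {})
        flat = {}
        # Python dicts keep insertion order, so inserting in list order fixes the layout.
        for field in TOP_LEVEL_FIELDS:
            if field in top_level:
                flat[field] = top_level[field]
        for field in ORDER_FIELDS:
            if field in order:
                flat[field] = order[field]
        result.append(flat)
    return result


# Example usage (made-up input):
# doc = {"Order": [{"Weight": 10, "top-level": {"FreightTerm": "X", "ProcessType": "A"}}]}
# print(json.dumps(reorder_orders(doc), indent=2))
```

The downside is the same as with the spec: every new field has to be added to the lists by hand.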