Member since: 07-30-2019
Posts: 3172
Kudos Received: 1571
Solutions: 918
04-10-2017
03:29 PM
@Blake Colson If you found my initial response helpful in answering your question, please accept the answer.
04-10-2017
03:18 PM
2 Kudos
@Emily Sharpe The templates directory was left in place to assist users who are moving from the NiFi 0.x to the NiFi 1.x baseline. Since NiFi 0.x placed all generated templates in this directory, you can copy those templates over to the configured directory in NiFi 1.x and NiFi will load them into the flow.xml.gz file for you on startup. There really is no other use for this property other than the above migration. Thanks,
Matt
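If you want to script that copy, here is a rough sketch that reads the templates directory configured in the 1.x nifi.properties and copies the 0.x template XML files into it. The install paths below are placeholders for your own environment, and it assumes the standard nifi.templates.directory property with relative values resolved against the NiFi home directory.

```python
# Rough sketch: copy NiFi 0.x template XML files into the templates directory
# configured in a NiFi 1.x install, so they are loaded into flow.xml.gz on startup.
# The paths below are placeholders for your own environment.
from pathlib import Path
import shutil

OLD_TEMPLATES_DIR = Path("/opt/nifi-0.7.4/conf/templates")            # example 0.x location
NEW_NIFI_PROPERTIES = Path("/opt/nifi-1.2.0/conf/nifi.properties")    # example 1.x install

def configured_templates_dir(props_file: Path) -> Path:
    """Read nifi.templates.directory from nifi.properties (defaults to ./conf/templates)."""
    for line in props_file.read_text().splitlines():
        if line.startswith("nifi.templates.directory="):
            value = line.split("=", 1)[1].strip()
            # Relative values are assumed to be relative to the NiFi home directory
            return (props_file.parent.parent / value).resolve() if value.startswith("./") else Path(value)
    return props_file.parent / "templates"

dest = configured_templates_dir(NEW_NIFI_PROPERTIES)
dest.mkdir(parents=True, exist_ok=True)
for template in OLD_TEMPLATES_DIR.glob("*.xml"):
    shutil.copy2(template, dest / template.name)   # picked up on the next NiFi startup
    print(f"copied {template.name} -> {dest}")
```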
04-10-2017
03:07 PM
@Blake Colson I am not completely clear on your response. Each NiFi instance has its own UI. In a NiFi cluster, no matter which node's UI you load, the canvas you see is that of the cluster. Every node in a NiFi cluster must run an identical dataflow. When you stand up your production NiFi instance or cluster, it will have its own UI and its own canvas. You cannot manage your dev, QA, and prod clusters from the same UI.

Once a dataflow exists on a canvas, you can start any portion or all of the components (processors, controller services, input ports, output ports, etc.). There is no requirement that the UI/canvas remain open in your browser in order for those dataflows to continue running. There is also no requirement that you template your entire dataflow each time. You can template only portions of a dataflow and move that portion between dev, QA, and prod.

The SDLC model is not completely there yet in NiFi. Let's say you have dataflow template version 1 deployed and now you want to deploy version 2. Going the template route requires you to bleed out all the data traversing the version 1 flow that is currently running. I would upload version 2 of the template and add it to my canvas, then stop any ingest processors in the version 1 flow already on the canvas. Allow the remaining processors to continue to run so that all data is eventually processed out of the version 1 flow. Start your version 2 dataflow so it begins ingesting all data from that point. Once the version 1 flow no longer has any data queued in it, you can select its components and delete them from the canvas. Thanks, Matt
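That cut-over can also be watched via the REST API rather than the UI. Here is a rough sketch that polls the aggregate status of the version 1 process group and reports when its queues have drained, at which point the old components are safe to delete. The unsecured URL and the process group ID are placeholders, and the endpoint and field names are as exposed by 1.x releases; adjust for your version and security setup.

```python
# Sketch: wait for the "version 1" process group to drain before deleting it.
# Assumes an unsecured NiFi at localhost:8080 and a known process group ID;
# the status field names may differ slightly between releases.
import time
import requests

NIFI_API = "http://localhost:8080/nifi-api"               # placeholder URL
VERSION1_PG_ID = "016a1000-1234-1000-abcd-000000000000"   # placeholder process group ID

def queued_flowfiles(pg_id: str) -> int:
    status = requests.get(f"{NIFI_API}/flow/process-groups/{pg_id}/status").json()
    # Aggregate snapshot across all nodes in the cluster
    return status["processGroupStatus"]["aggregateSnapshot"]["flowFilesQueued"]

while True:
    remaining = queued_flowfiles(VERSION1_PG_ID)
    if remaining == 0:
        print("Version 1 flow is drained; its components can be deleted from the canvas.")
        break
    print(f"{remaining} FlowFiles still queued in the version 1 flow; waiting...")
    time.sleep(30)
```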
04-10-2017
02:26 PM
1 Kudo
@Michael Silas There is nothing in the users.xml or authorizations.xml file that is specific to any one node. In fact, these files are checked on startup to make sure they are identical across all your NiFi cluster nodes. In order for a node to successfully join a cluster, the flow.xml.gz, users.xml, and authorizations.xml files must match. If you configure a new node to join an existing cluster and it has none of the above three files and you have not configured the authorizers.xml file on the new node, the new node will inherit/download these three files from the cluster automatically. Even if you have not added the new node to the "proxy user requests" global access policy yet, you should still be able to connect to the UI of your other nodes and add it afterwards. Again, you are only adding more work for yourself by deleting the users.xml and authorizations.xml files. Thanks, Matt
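If you want a quick sanity check that those files really do match before starting the new node, a rough sketch is below: it prints digests you can compare between hosts. The conf directory path is a placeholder, and flow.xml.gz is hashed on its decompressed contents since gzip metadata can differ even when the flow itself is identical (NiFi itself compares a flow fingerprint, so this is only a rough check).

```python
# Sketch: print SHA-256 digests of the files that must be identical across
# cluster nodes (flow.xml.gz, users.xml, authorizations.xml) so their outputs
# can be compared between hosts. The conf path is a placeholder.
import gzip
import hashlib
from pathlib import Path

CONF_DIR = Path("/opt/nifi/conf")  # adjust to your install

def digest(path: Path) -> str:
    # Hash flow.xml.gz on its decompressed contents, since gzip metadata
    # (e.g. timestamps) can differ even when the flow itself is identical.
    data = gzip.decompress(path.read_bytes()) if path.suffix == ".gz" else path.read_bytes()
    return hashlib.sha256(data).hexdigest()

for name in ("flow.xml.gz", "users.xml", "authorizations.xml"):
    path = CONF_DIR / name
    if path.exists():
        print(f"{name}: {digest(path)}")
    else:
        print(f"{name}: not present (will be inherited from the cluster on join)")
```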
04-10-2017
02:13 PM
2 Kudos
@Blake Colson The answer to that question lies in what version of NiFi you are running. I will assume you are running the latest version in this response (NiFi 1.2.0 or HDF 2.1.2 as of the time this was written). You have two options for moving your entire dataflow from one system to another.

1. Copy the flow.xml.gz file from one NiFi instance to another.
- This method requires that both NiFi instances use the same configured sensitive props key in their nifi.properties files. The sensitive props key is used to encrypt the sensitive properties (passwords) in the various components on your canvas. If they don't match, your new NiFi will not be able to load using the flow.xml.gz file you copied over (see the sketch following this post for a quick way to compare the keys before copying).
- The benefit of this method is that you get your entire flow, including all configured passwords.

2. Create a template of your entire dataflow, download it, and then import it into your new NiFi.
- Provide a name and description for your template.
- Once the template is created, you will need to download it: click the download icon to the right of your newly created template. You now have a copy of your template to import into your new NiFi.
- Once uploaded, you can add the template to your canvas by dragging the "Template" icon onto your canvas and selecting your newly uploaded template from the selection list.
- Templates are sanitized of any sensitive properties when they are created so that they can be used in other NiFi instances. You will need to go to any processor that used a sensitive property and re-enter those sensitive values when using this method.

Thank you, Matt
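For option 1, here is a rough sketch of that pre-copy check: it compares the nifi.sensitive.props.key value in the two nifi.properties files before copying flow.xml.gz over. The install paths are placeholders, and the target NiFi should be stopped before the copy.

```python
# Sketch: confirm both NiFi installs use the same sensitive props key before
# copying flow.xml.gz between them. Paths are placeholders for your environment.
from pathlib import Path
import shutil

SOURCE_PROPS = Path("/opt/nifi-dev/conf/nifi.properties")
TARGET_PROPS = Path("/opt/nifi-prod/conf/nifi.properties")
SOURCE_FLOW = SOURCE_PROPS.parent / "flow.xml.gz"
TARGET_FLOW = TARGET_PROPS.parent / "flow.xml.gz"

def sensitive_props_key(props_file: Path) -> str:
    for line in props_file.read_text().splitlines():
        if line.startswith("nifi.sensitive.props.key="):
            return line.split("=", 1)[1].strip()
    return ""  # blank key (the default) if the property is not set

if sensitive_props_key(SOURCE_PROPS) == sensitive_props_key(TARGET_PROPS):
    shutil.copy2(SOURCE_FLOW, TARGET_FLOW)  # target NiFi should be stopped first
    print("Sensitive props keys match; flow.xml.gz copied.")
else:
    print("Sensitive props keys differ; the copied flow.xml.gz would fail to load.")
```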
04-10-2017
12:49 PM
@Michael Silas In an existing cluster it is likely that you have established a number of user policies beyond the default "Initial Admin Identity". You do not want to delete the users.xml or authorizations.xml file at this time, as you would lose all those added users and authorizations. Instead, add the new node as a /proxy user before actually adding the node to the cluster. You can copy the users.xml, authorizations.xml, and flow.xml.gz files to your new node at that time if you want (see the sketch following this post).
- Agree - create a new cert for that node. If using the NiFi CA, you can simply check the box to regenerate certificates in Ambari (available in HDF 2.x releases).
- Agree that you need to be mindful of any custom NARs as well as any referenced local files, as these all need to be copied to your new node as well.
Matt
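A rough sketch of that copy step is below, pushing the three files to the new node with scp. The hostname, user, and conf paths are placeholders, and the new node's NiFi should not be running yet.

```python
# Sketch: push users.xml, authorizations.xml, and flow.xml.gz from an existing
# node to the new node's conf directory before it joins the cluster.
# Hostname, user, and paths are placeholders; run while the new node is stopped.
import subprocess
from pathlib import Path

LOCAL_CONF = Path("/opt/nifi/conf")                  # existing node
REMOTE = "nifi@new-node.example.com:/opt/nifi/conf"  # new node (placeholder)

for name in ("users.xml", "authorizations.xml", "flow.xml.gz"):
    src = LOCAL_CONF / name
    subprocess.run(["scp", str(src), f"{REMOTE}/{name}"], check=True)
    print(f"copied {name} to the new node")
```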
04-10-2017
12:42 PM
Only Apache NiFi 0.x and HDF 1.x versions have an NCM-based cluster.
NiFi 1.x and HDF 2.x moved to zero-master clustering, which no longer relies on an NCM.
04-10-2017
12:40 PM
@Dmitro Vasilenko Are you seeing an error or warn log message produced by the ConsumeKafka processor when you run it?
What is the processor's configured run strategy/schedule?
04-07-2017
01:10 PM
@Paul Yang
The election of a Primary Node and the Cluster Coordinator occurs through ZooKeeper. Once a Cluster Coordinator is elected, all nodes begin sending heartbeats directly to it. If a heartbeat is not received within the configured threshold, that node will be disconnected. A single node disconnecting and reconnecting may indicate a problem with just that node (network latency between the node and the Cluster Coordinator, a garbage collection "stop the world" event that prevents the node from heartbeating to the Cluster Coordinator, etc.). Check the NiFi app log on your nodes to make sure they are sending heartbeats regularly.

In your case you mention the Cluster Coordinator changes nodes frequently. This means that a new node is being elected as the Cluster Coordinator by ZooKeeper, which occurs when the current Cluster Coordinator has trouble communicating with ZooKeeper. Again, garbage collection can be the cause.

There is a known bug in HDF 2.0 / NiFi 1.0 (https://issues.apache.org/jira/browse/NIFI-2999) that can result in all nodes being disconnected when the Cluster Coordinator changes hosts. Since nodes send heartbeats directly to the current Cluster Coordinator, whichever node is the current Cluster Coordinator keeps track of when the last heartbeat was received from each node. Let's assume a 3-node cluster (Nodes A, B, and C). Node A is the current Cluster Coordinator and is receiving heartbeats. At some point later, Node B becomes the Cluster Coordinator and all nodes start sending heartbeats there. The bug, which has since been addressed, occurs if at some point later Node A becomes the Cluster Coordinator again. When that happens, Node A looks at the last time it received heartbeats, which it has from when it was previously the Cluster Coordinator, but since they are all old, every node gets disconnected. They then auto-reconnect on the next heartbeat.

You can upgrade to get away from this bug (HDF 2.1 / NiFi 1.1), but ultimately you need to address the issue that is causing the Cluster Coordinator to change nodes. This is either a loading issue where there are insufficient resources to maintain a connection with ZooKeeper, an overloaded ZooKeeper, a ZooKeeper that does not have quorum, a node garbage collection issue resulting in too long a lapse between ZooKeeper connections, etc. Thanks, Matt
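For the log check above, here is a rough sketch that scans nifi-app.log for lines mentioning heartbeats and flags unusually long gaps between them. The log path, the default logback timestamp prefix, and the assumption that heartbeat messages contain the word "heartbeat" are all environment-dependent; adjust them to your logging pattern and heartbeat interval.

```python
# Sketch: flag long gaps between heartbeat-related log lines in nifi-app.log.
# Assumes the default logback timestamp prefix ("2017-04-07 13:10:00,123") and
# that heartbeat messages contain the word "heartbeat"; adjust for your setup.
import re
from datetime import datetime, timedelta

LOG_FILE = "/opt/nifi/logs/nifi-app.log"   # placeholder path
MAX_GAP = timedelta(seconds=30)            # alert threshold; tune to your heartbeat interval

timestamp_re = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}")
previous = None

with open(LOG_FILE, errors="replace") as log:
    for line in log:
        if "heartbeat" not in line.lower():
            continue
        match = timestamp_re.match(line)
        if not match:
            continue
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if previous and current - previous > MAX_GAP:
            print(f"Gap of {current - previous} between heartbeat log lines ending at {current}")
        previous = current
```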
04-07-2017
12:26 PM
@Sanaz Janbakhsh Unfortunately, a formula for what percentage of your disk should be allocated to each repo does not exist, and would frankly be impossible to establish considering how many dynamic inputs come into play. But to establish a starting point from which to adjust, I would suggest the following (see the sketch after this post for the arithmetic against a given disk size):
- 10% - 15% --> FlowFile Repository
- 5% - 10% --> Database Repository
- 50% - 60% --> Content Repository
- ? --> Provenance Repository (depends on your retention policies, but the provenance repo can be restricted to a maximum size in the NiFi configs. The default is 1 GB of disk usage or 24 hours. These are soft limits, so it may temporarily exceed the size threshold until clean-up occurs; don't set the size to the exact size of the partition it is configured to use.)
- 10% - 15% --> /logs (This is very subjective as well. How much log history do you need to retain? What default log levels have you set? While the /logs directory may stay relatively small during good times, an outage can result in a logging explosion. Consider a downstream system outage: all NiFi processors that are trying to push data to that downstream system will produce ERROR logs during that time.)

The above assumes your OS and applications are installed on a different disk. If not, you will need to adjust accordingly. Thanks, Matt
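As a quick worked example of those percentages, the sketch below turns a total disk size into suggested per-repository allocations using the midpoint of each range. The 500 GB figure is just an example; treat the output as a starting point to adjust from, exactly as described above.

```python
# Sketch: turn a total disk size into starting-point repository allocations
# using the midpoints of the ranges suggested above. Purely illustrative.
TOTAL_DISK_GB = 500  # example disk size

suggested_ranges = {
    "flowfile_repository": (0.10, 0.15),
    "database_repository": (0.05, 0.10),
    "content_repository":  (0.50, 0.60),
    "logs":                (0.10, 0.15),
}

allocated = 0.0
for repo, (low, high) in suggested_ranges.items():
    midpoint = (low + high) / 2
    size_gb = TOTAL_DISK_GB * midpoint
    allocated += size_gb
    print(f"{repo:>20}: ~{size_gb:.0f} GB ({midpoint:.0%} of disk)")

# Whatever remains is available for the provenance repository (bounded by the
# max storage size/time settings in nifi.properties) plus free headroom.
print(f"{'remaining':>20}: ~{TOTAL_DISK_GB - allocated:.0f} GB for provenance repo + headroom")
```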