Member since: 07-30-2019
Posts: 3467
Kudos Received: 1641
Solutions: 1016
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 134 | 05-04-2026 05:20 AM |
| | 437 | 03-23-2026 05:44 AM |
| | 328 | 02-18-2026 09:59 AM |
| | 576 | 01-27-2026 12:46 PM |
| | 1009 | 01-20-2026 05:42 AM |
03-27-2020
12:56 PM
1 Kudo
@venk What you have run into here is a known issue. Your cluster was originally set up and running unsecured over HTTP port 8080. NiFi records the details of the nodes that are part of the cluster so that on later restarts it knows it should still wait for additional nodes to join before allowing users to make changes to the canvas. The downside is that when you switched to being secured over HTTPS on port 9091, the cluster now thinks it should have twice as many nodes as really exist. This is an easy fix. In your NiFi conf directory you will find the file "state-management.xml". Inside that file is a section for NiFi's "local-provider" that contains the directory where your local state is kept. This path is normally the same on every node. Shut down NiFi, go to that directory on every node in your cluster, and delete the contents of the state directory. Restart NiFi and it will create new entries for only your secured nodes. https://issues.apache.org/jira/browse/NIFI-7255 Hope this helps, Matt
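The cleanup described above can be sketched as follows. This assumes the default local state directory from state-management.xml's local-provider; point STATE_DIR at the path configured in your own file, and make sure NiFi is stopped first.

```shell
# Sketch of clearing NiFi local state; STATE_DIR is the "Directory" property
# of the local-provider in state-management.xml (default ./state/local).
STATE_DIR="${STATE_DIR:-./state/local}"
mkdir -p "$STATE_DIR"        # created here only so the sketch runs anywhere
rm -rf "$STATE_DIR"/*        # NiFi must be stopped before clearing this
ls -A "$STATE_DIR"           # prints nothing once the stale entries are gone
```

Repeat on every node, then restart NiFi.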
03-25-2020
02:37 PM
@Faerballert Perhaps you could clone your FlowFile before the MergeContent processor. Whichever relationship you are connecting to your current MergeContent, drag a second connection with that same relationship to a parallel notification flow. Down this parallel path, use a ReplaceText processor to replace the content with the value of the attribute you want to merge. Then use a MergeContent processor on this path to merge these files using a "," as your demarcator. From that MergeContent you do your notification. You may also want to open an Apache Jira with your use case and desired improvement to the existing MergeContent. The more details the better. Hope this helps, Matt
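A rough shell analogue of that parallel path (not NiFi itself): each FlowFile's content has been replaced by its attribute value, and the merge concatenates them with a "," demarcator. The values below are made up for illustration.

```shell
# Three "FlowFiles" whose content was replaced by an attribute value,
# merged into one comma-separated payload for the notification.
printf '%s\n' value-A value-B value-C | paste -sd, -
# → value-A,value-B,value-C
```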
03-24-2020
12:24 PM
@Koffi When a NiFi node attempts to connect to an existing NiFi cluster, three files are checked to make sure they match exactly between the connecting node and the existing copies in the cluster:
1. flow.xml.gz
2. users.xml (only exists if NiFi is secured over HTTPS)
3. authorizations.xml (not to be confused with NiFi's authorizers.xml file; only exists if NiFi is secured over HTTPS)
The nifi-app.log on the node should explain exactly what the mismatch was the first time it tried to connect to the cluster. Hope this helps, Matt
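One quick way to spot the mismatch is to checksum the three files on each node and compare the output. The demo below uses a temp directory with dummy files so it is self-contained; in real use, run the md5sum line inside each node's conf directory.

```shell
# Self-contained demo: checksum the three files NiFi compares on cluster join.
# CONF is a throwaway directory here; on a real node it would be $NIFI_HOME/conf.
CONF="$(mktemp -d)"
printf 'flow'  > "$CONF/flow.xml.gz"
printf 'users' > "$CONF/users.xml"
printf 'auth'  > "$CONF/authorizations.xml"
( cd "$CONF" && md5sum flow.xml.gz users.xml authorizations.xml )
# Any file whose sum differs from the cluster's copy will block the join.
```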
03-24-2020
12:02 PM
@Faerballert The NiFi merge-based processors only offer the options "Keep Common Attributes" (keeps only attributes where every merged FlowFile has the same attribute with the same value) and "Keep All Unique Attributes" (same as above, but also keeps any attribute that exists on only one of the merged FlowFiles, or that has the same value everywhere it appears). There is no option to merge all attributes into a comma-separated list of unique values. What is the use case for such an attribute merge? There would be no way to tell which value goes with which chunk of the merged data, and if the merged FlowFile were later split, every resulting split FlowFile would carry all the same FlowFile attributes. Hope this helps, Matt
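A toy illustration of the "Keep Common Attributes" semantics, modeling two FlowFiles' attributes as sorted key=value lists (the attribute names and values are invented for the example):

```shell
# Two FlowFiles' attributes; only entries identical in both survive a
# "Keep Common Attributes" merge (filename differs, so it is dropped).
printf '%s\n' 'path=/a' 'filename=f1' 'host=n1' | sort > /tmp/ff1_attrs
printf '%s\n' 'path=/a' 'filename=f2' 'host=n1' | sort > /tmp/ff2_attrs
comm -12 /tmp/ff1_attrs /tmp/ff2_attrs    # host=n1 and path=/a survive
```

"Keep All Unique Attributes" would additionally keep filename=f1 and filename=f2-style entries that appear on only one side.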
03-24-2020
11:44 AM
@domR i) Do List processors w/ timestamp tracking store state locally?
--- If you are running a standalone NiFi and not a NiFi cluster, all state is stored locally on disk.
--- If clustered, this depends on the list processor and how it is configured. The ListFile processor can be configured to store state locally or remotely depending on your use case. For example, if a ListFile is added to a NiFi cluster and every node is listing from a local path not shared across nodes, you would want each node to store the ListFile state locally, since the state is unique per node and other nodes have no access to the directory on each node. If your ListFile is listing against a directory that is mounted on every node in the cluster, it should be configured for remote state and scheduled to run on the primary node only.
--- The other list-based processors store state locally ONLY on a standalone NiFi. Clustered NiFi installs will cause state to be stored in ZooKeeper.
ii) Does this state survive NiFi restarts?
--- Yes. Local state is stored on disk in NiFi's local state directory, and cluster/remote state is stored in ZooKeeper. State storage is configured via the state-management.xml configuration file.
iii) If running on primary node only, would this mean when another primary node is chosen, the List processor would list any files it hasn't tracked (and re-ingress a large backlog of files if still there)?
--- When a primary node change occurs, the primary-node-only processors on the previous primary node are asked to stop executing and the same processors on the newly elected primary node are asked to start. On the new node, each such processor retrieves the last known state stored in ZooKeeper before executing. There is a small chance of limited data duplication: asking a processor to stop does not kill its active threads, so if the processor on the old primary node is mid-execution and does not complete (i.e., update cluster state in ZooKeeper) before the newly elected primary node pulls cluster state and starts executing, some files may be listed again by the new node. It will not, however, list from the beginning.
iv) What out-of-box solutions can help to get around the issue of non-persisted non-distributed listing, or do we need custom auditing triggering individual listings?
--- NiFi does persist state through node restarts. Note: you can right-click on a processor that stores state and select "View state" to see what has been stored. You can also right-click on a processor and select "View usage" to open the embedded documentation for that component; its "State management:" section tells you whether the component stores state and whether that state is stored locally or in the cluster (ZooKeeper). Hope this helps, Matt
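For reference, the local-provider section of state-management.xml looks roughly like this on a stock install (verify the class name and "Directory" value against your own file):

```xml
<local-provider>
    <id>local-provider</id>
    <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
    <property name="Directory">./state/local</property>
</local-provider>
```

The companion cluster-provider section in the same file points NiFi at ZooKeeper for cluster-scoped state.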
03-24-2020
11:10 AM
@Alexandros Going to ask the simple question first... There are FlowFiles traversing the processors in this newly instantiated flow from your template, correct? --- The next thought would be around authorizations (assuming your NiFi is secured):
1. Is the user running the provenance query authorized to "view provenance" and "view the data" on the components? If these policies are set on the process group containing the processor components and not on the components themselves, the components will inherit the policies from the process group.
2. Is this a NiFi cluster? If so, make sure your NiFi nodes are also authorized to "view provenance" and "view the data". When you authenticate to NiFi and run a provenance query, that query is replicated to all nodes in your cluster, and the results are returned to the node on which the originating request was made. If that node is not authorized to view data returned from other nodes, it will not be displayed.
--- Then we need to make sure provenance is still working: while you are seeing provenance events displayed for your existing flow, are those results recent? If you monitor the contents of your provenance_repository, do you see timestamps updating on the <num>.prov files? We need to make sure provenance has not stopped working for some reason. Also make sure you are using the WriteAheadProvenanceRepository implementation (should be the default in 1.11) and not the PersistentProvenanceRepository implementation (configured in the nifi.properties file). Hope this helps, Matt
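A quick sketch of the repository freshness check: point PROV_REPO at the provenance_repository path from nifi.properties and look at the newest file timestamps. A dummy directory and file are used here so the example runs anywhere.

```shell
# Check whether provenance files are still being written. On a real node,
# set PROV_REPO to the provenance_repository path from nifi.properties.
PROV_REPO="${PROV_REPO:-$(mktemp -d)}"
touch "$PROV_REPO/1.prov"          # stand-in for a real event file
ls -lt "$PROV_REPO" | head -5      # newest first; timestamps should be recent
```

If the newest .prov timestamps stop advancing while data is flowing, provenance has stalled.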
03-16-2020
05:09 PM
1 Kudo
@Gubbi Depending on which processor is used to create your FlowFile from the source Linux directory, you will likely have an "absolute.path" FlowFile attribute created on the FlowFile: absolute.path = /users/abc/20200312/gtry/ You can pass that FlowFile to an UpdateAttribute processor, which can use NiFi Expression Language (EL) to extract the date from that absolute path into a new FlowFile attribute. Add a new property (the property name becomes the new FlowFile attribute):
Property: pathDate
Value: ${absolute.path:getDelimitedField('4','/')}
The resulting FlowFile will have a new attribute: pathDate = 20200312 Now you can use that FlowFile attribute later when writing to your target directory in S3. I assume you would use the PutS3Object processor for this? If so, you can configure the "Object Key" property with the following: /Users/datastore/${pathDate}/$(unknown) NiFi EL will replace ${pathDate} with "20200312" and $(unknown) will be replaced with "gyyy.csv". Hope this helps you, Matt
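A quick shell analogue of that Expression Language call, splitting the path on "/" and taking the fourth field (field 1 is the empty string before the leading slash):

```shell
# Shell analogue of ${absolute.path:getDelimitedField('4','/')}:
echo "/users/abc/20200312/gtry/" | cut -d/ -f4    # → 20200312
```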
03-06-2020
10:41 AM
@vikrant_kumar24 You would not configure your Python script to write an XML file to disk; NiFi handles FlowFile creation in the framework. Any data passed by your Python script to STDOUT will be populated into the content of the resulting FlowFile routed to the "output stream" relationship of the ExecuteStreamCommand processor. Your script does not need any awareness of what a FlowFile is or how it is created. So you simply have your Python script send the XML content to STDOUT and NiFi will take care of putting that content into the FlowFile that is produced and routed to the "output stream" relationship of the processor. You can then use the UpdateAttribute processor to change the filename associated with that content. Hope this helps, Matt
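A minimal sketch of the idea (script path, name, and XML payload are all hypothetical): ExecuteStreamCommand runs the command and turns whatever it writes to STDOUT into the content of the outgoing FlowFile, so the script needs no FlowFile awareness at all.

```shell
# Hypothetical script that ExecuteStreamCommand would invoke; it simply
# prints XML to STDOUT, which NiFi captures as the FlowFile content.
cat > /tmp/make_xml.py <<'EOF'
import sys
sys.stdout.write("<record><id>1</id></record>")  # becomes the FlowFile content
EOF
python3 /tmp/make_xml.py
```

In the processor you would configure "Command Path" as python3 and "Command Arguments" as the script path.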
03-06-2020
10:28 AM
@anil35759 If you create a NiFi template that includes a NiFi processor referencing a controller service, that controller service is included in the generated template. So if you import and instantiate that template on the canvas of another NiFi, the controller service will be added as well and will be associated with the processor instantiated from that same template. https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#templates So I am not clear on why you need to import the controller service separately via the toolkit. Controller services are not mapped to processors; the association happens from the processor side. The processor is mapped to a specific controller service via the controller service's assigned UUID. Keep in mind that anything you can do within the NiFi UI can also be accomplished using NiFi REST API endpoints. Using your browser's developer tools to capture the requests as you make them through the NiFi UI is a great way to learn how to interact with the REST API. The REST API endpoints specific to your NiFi release version can be found under Help within your install. Here are the Apache NiFi REST API docs for the latest Apache release version: https://nifi.apache.org/docs/nifi-docs/rest-api/index.html Essentially, what you need to do here is update the processor configuration to reference the UUID of whichever controller service you want it to use.
curl 'http://<nifi-hostname>:<nifi-port>/nifi-api/processors/b10fd083-0170-1000-0000-00007f7c905f' \
  -X PUT \
  -H 'Content-Type: application/json' \
  --data-binary '{"component":{"id":"b10fd083-0170-1000-0000-00007f7c905f","name":"ExecuteSQL","config":{"concurrentlySchedulableTaskCount":"1","schedulingPeriod":"0 sec","executionNode":"ALL","penaltyDuration":"30 sec","yieldDuration":"1 sec","bulletinLevel":"WARN","schedulingStrategy":"TIMER_DRIVEN","comments":"","autoTerminatedRelationships":[],"properties":{"Database Connection Pooling Service":"e60cb24c-95c5-3a97-bcb9-9e537006317d"}},"state":"STOPPED"},"revision":{"clientId":"affea95c-0170-1000-988b-73bf756785b3","version":2},"disconnectedNodeAcknowledged":false}'
You'll notice from the example above that I am updating a processor's (ExecuteSQL) configuration so that the "Database Connection Pooling Service" property is mapped to the UUID of my DBCPConnectionPool controller service (e60cb24c-95c5-3a97-bcb9-9e537006317d). Note: The "clientId" string can be anything. Hope this helps, Matt
03-05-2020
10:56 AM
@anil35759 It may be helpful if you can share the exact commands you are performing now to export and import your controller services. Thanks, Matt