Member since
07-30-2019
3421
Posts
1628
Kudos Received
1010
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 179 | 01-13-2026 11:14 AM |
| | 324 | 01-09-2026 06:58 AM |
| | 568 | 12-17-2025 05:55 AM |
| | 629 | 12-15-2025 01:29 PM |
| | 581 | 12-15-2025 06:50 AM |
02-21-2017
06:49 PM
2 Kudos
@Raj B

1. The main intent of NiFi Provenance is data governance: the ability to look back at the life of a FlowFile. It can tell you where a FlowFile originated, what parent FlowFile it was part of, how many parent FlowFiles were used to create it, what changes were made to it, where it was sent, when it was terminated from NiFi, etc. NiFi Provenance also provides a means to view or replay, at any point in your dataflow, FlowFiles that are no longer anywhere in your dataflow (provided the FlowFile's content still exists in the content repository's archive).

Examples:
- Some downstream system expected to receive file "ABC" over the weekend from NiFi. You can use NiFi's data provenance to see exactly when file "ABC" was received by NiFi and exactly what NiFi did to file "ABC" as it traversed your dataflows.
- A FlowFile "XYZ" was expected to route through your dataflow to some destination "G". Upon searching Provenance, it was discovered that "XYZ" was routed down the wrong path. You could correct your dataflow routing issue and use data provenance to replay "XYZ" at the point just prior to the dataflow correction.

2. NiFi's Provenance repository retains all Provenance events generated by your dataflow until either the retention time or the max disk usage property is reached. When either condition is met, the oldest provenance events are deleted first. There is no way to selectively decide which provenance events are retained in the repository.

3. The Provenance API provides a means for running queries directly against the Provenance data stored locally on a particular NiFi instance. The SiteToSiteProvenanceReportingTask provides a way of sending provenance events to another system, for example for longer-term storage. Since provenance events do not contain any FlowFile content, only provenance events stored locally within a NiFi instance can be used to view or replay content.

Thanks, Matt
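The retention behavior described in point 2 can be pictured with a toy model. This is only a sketch of the "oldest events go first" rule, not NiFi's actual repository code; the `Event` class, the `purge` function, and the per-event `size` field are hypothetical names made up for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # when the event was recorded, in seconds (hypothetical unit)
    size: int         # bytes the event occupies on disk (hypothetical)


def purge(events: deque, now: float, max_age: float, max_bytes: int) -> deque:
    """Drop events from the old end until BOTH limits are satisfied.

    `events` must be ordered oldest first. An event is dropped when it is
    older than `max_age` or when the total size still exceeds `max_bytes`.
    """
    total = sum(e.size for e in events)
    while events and (now - events[0].timestamp > max_age or total > max_bytes):
        total -= events.popleft().size
    return events


events = deque([Event(0, 10), Event(50, 10), Event(90, 10)])
purge(events, now=100, max_age=60, max_bytes=15)
print([e.timestamp for e in events])
```

Note that the second event is removed even though it is young enough, because the size limit is still exceeded after the first removal. Nothing in this model lets you pick which events survive, which matches the point above.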
02-21-2017
04:37 PM
@mayki wogno One thing you could do is set "FlowFile Expiration" on the connection containing the "merged" relationship, and select "NewestFlowFileFirstPrioritizer" from the "Available Prioritizers". FlowFile expiration is measured against the age of the FlowFile (from creation time to now), not against how long it has been in a particular connection. If the FlowFile's age exceeds the configured value, it is purged from the queue.
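A rough sketch of that rule, assuming each FlowFile is represented as a plain dict with a hypothetical `created_ms` field; the function names are made up for illustration and this is not NiFi's queue implementation.

```python
def is_expired(created_ms: int, expiration_ms: int, now_ms: int) -> bool:
    """Age is measured from creation time, not from arrival in the queue."""
    return now_ms - created_ms > expiration_ms


def next_flowfiles(queue, expiration_ms, now_ms):
    """Purge expired entries, then order the rest newest first."""
    live = [ff for ff in queue if not is_expired(ff["created_ms"], expiration_ms, now_ms)]
    return sorted(live, key=lambda ff: ff["created_ms"], reverse=True)


queue = [
    {"name": "old", "created_ms": 1_000},
    {"name": "mid", "created_ms": 50_000},
    {"name": "new", "created_ms": 58_000},
]
# With a 30 second expiration at t = 60 s, "old" is purged even if it only
# just arrived in this connection, because its total age is 59 s.
print([ff["name"] for ff in next_flowfiles(queue, expiration_ms=30_000, now_ms=60_000)])
```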
02-21-2017
03:44 PM
1 Kudo
@Andy Liang The ConsumeJMS and PublishJMS processors can be used with IBM MQ. They require you to set up a "JMSConnectionFactoryProvider" controller service to facilitate the IBM MQ connection. You will need to download the IBM MQ client library onto the server where your NiFi is running. Matt
02-21-2017
03:09 PM
1 Kudo
@mayki wogno You can reduce or even eliminate the WARN messages by placing a MergeContent processor between your first and second DeleteHDFS processors, with "path" set as the value of the "Correlation Attribute Name" property. The resulting merged FlowFile(s) would still have the same "path" attribute, which the second DeleteHDFS would use to remove your directory. Matt
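As a rough picture of why this cuts down the WARN messages: merging on a correlation attribute collapses many FlowFiles that share a "path" into one, so the directory delete is attempted far fewer times. The sketch below assumes FlowFiles are plain dicts; `merge_by_attribute` is a hypothetical helper, and real MergeContent binning also depends on its other properties (minimum entries, max bin age, and so on).

```python
from collections import defaultdict


def merge_by_attribute(flowfiles, attribute="path"):
    """Group FlowFiles by one attribute and emit a single merged record per group."""
    bins = defaultdict(list)
    for ff in flowfiles:
        bins[ff[attribute]].append(ff)
    # The merged record keeps the shared attribute value.
    return [{attribute: key, "merged_count": len(group)} for key, group in bins.items()]


incoming = [
    {"filename": "a1.csv", "path": "/data/a"},
    {"filename": "a2.csv", "path": "/data/a"},
    {"filename": "b1.csv", "path": "/data/b"},
]
for merged in merge_by_attribute(incoming):
    print(merged)
```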
02-21-2017
02:10 PM
That was the intent. The directory deletion would only succeed after all files were deleted first, so only after the last file was removed would the directory deletion be successful.
02-21-2017
01:48 PM
@mayki wogno FlowFiles generated by the ListHDFS processor all have a "path" attribute created on them. That attribute could be used to trigger your directory deletion via the DeleteHDFS processor. What is difficult here is determining when all data has been successfully pulled from an HDFS directory before deleting the directory itself. You could try using two DeleteHDFS processors in series. The first DeleteHDFS deletes the files from the target "path" of the incoming FlowFiles, and the second deletes the directory (Recursive property set to false). Matt
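The same idea can be tried on a local filesystem as an analogy. This sketch uses Python's `os` module rather than HDFS, and `delete_file_then_try_dir` is a hypothetical helper; it only illustrates that a non-recursive directory delete keeps failing until the last file is gone.

```python
import os
import tempfile


def delete_file_then_try_dir(file_path: str) -> bool:
    """Delete one file, then attempt a non-recursive delete of its directory.

    Returns True only when the directory was empty and could be removed.
    """
    os.remove(file_path)
    directory = os.path.dirname(file_path)
    try:
        os.rmdir(directory)  # non-recursive: raises OSError if files remain
        return True
    except OSError:
        return False


# Build a scratch directory with two files to stand in for the target "path".
target = tempfile.mkdtemp()
paths = []
for name in ("part-0", "part-1"):
    p = os.path.join(target, name)
    with open(p, "w") as fh:
        fh.write("data")
    paths.append(p)

for p in paths:
    print(os.path.basename(p), "-> directory removed:", delete_file_then_try_dir(p))
```

The failed attempts before the last file are the local equivalent of the warnings you would expect from the second processor.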
02-21-2017
01:25 PM
@Pradhuman Gupta The WARN and ERROR messages you see when you hover your cursor over the red notification icon on a processor are also written to nifi-app.log. There is no way to capture those bulletins directly from a processor and route them to a PutEmail processor. If there are specific processor types you want to monitor for WARN and/or ERROR messages, you could modify your NiFi's logback.xml file so that logs generated by those processor classes are written to their own output log file. You could then set up a dataflow that tails that new log and sends an email when WARN and/or ERROR log messages are written to it. Thanks, Matt
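The filtering step of such a flow amounts to something like the sketch below. It is written in Python only to show the logic; in NiFi the equivalent would be built from processors, and the sample log lines, the `LEVEL` pattern, and the `alert_lines` function are all made up for illustration.

```python
import re

# Match the log level as a whole word so "WARNING_COUNT" style text is not
# picked up by accident.
LEVEL = re.compile(r"\b(WARN|ERROR)\b")


def alert_lines(lines):
    """Return only the log lines that should trigger an email."""
    return [line.rstrip("\n") for line in lines if LEVEL.search(line)]


sample = [
    "2017-02-21 13:20:01,100 INFO [Timer-Driven Process Thread-1] started\n",
    "2017-02-21 13:20:02,200 WARN [Timer-Driven Process Thread-2] could not delete path\n",
    "2017-02-21 13:20:03,300 ERROR [Timer-Driven Process Thread-3] failed to process\n",
]
for line in alert_lines(sample):
    print(line)
```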
02-21-2017
01:16 PM
@mayki wogno Make sure the user your NiFi is running as is authorized to delete files and directories in your target HDFS. The DeleteHDFS processor properties are as follows: Thanks, Matt
02-16-2017
05:11 PM
1 Kudo
@Anshuman Ghosh The Search Value regex above has 4 capture groups, one for each number of a valid IP address. Each capture group can then be referenced in the Replacement Value as $1, $2, $3, and/or $4. In the example above, the replacement for each valid IP found is the first two numbers followed by ".x.x". You can of course change the Replacement Value to whatever meets your specific needs. Thanks, Matt
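For anyone who wants to try the capture-group idea outside NiFi, here is a small sketch in Python. The pattern below is my own assumption of what a four-group IP regex could look like, not necessarily the one from the original screenshot, and `mask_ips` is a hypothetical helper. Note that Python writes back-references as `\1` and `\2`, where the NiFi Replacement Value uses `$1` and `$2`.

```python
import re

# One octet in the range 0-255, wrapped in a capture group.
OCTET = r"(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
# Four octets separated by dots gives four capture groups.
IP_PATTERN = re.compile(rf"\b{OCTET}\.{OCTET}\.{OCTET}\.{OCTET}\b")


def mask_ips(text: str) -> str:
    """Keep the first two numbers of each IP and mask the last two."""
    return IP_PATTERN.sub(r"\1.\2.x.x", text)


print(mask_ips("login from 192.168.10.25 ok"))
```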
02-16-2017
05:06 PM
1 Kudo
@Anshuman Ghosh I posted your question here for you.