Member since 07-30-2019
3471 Posts
1642 Kudos Received
1020 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 175 | 06-03-2026 06:06 PM |
| | 471 | 05-06-2026 09:16 AM |
| | 902 | 05-04-2026 05:20 AM |
| | 512 | 05-01-2026 10:15 AM |
| | 639 | 03-23-2026 05:44 AM |
04-18-2022
05:54 AM
@Neil_1992 I agree that the first step here is to increase the open file limit for the user that owns your NiFi process. Check your current ulimit by becoming the user that owns your NiFi process and executing the "ulimit -a" command. You can also inspect the /etc/security/limits.conf file. NiFi can open a very large number of files. Higher FlowFile load, larger dataflows, more concurrent tasks, etc. all contribute to the number of open file handles. I recommend setting the ulimit to a very large value like 999999, restarting NiFi, and seeing if your issue persists. If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
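For anyone who would rather script the check than read "ulimit -a" output, here is a minimal sketch in Python. It assumes a Linux/Unix host and that it is run as the user that owns the NiFi process. The function names are hypothetical, and the 999999 target simply mirrors the value suggested above.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current user/process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, showing 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def meets_target(soft, target=999999):
    """True when the soft limit is unlimited or at least the target value."""
    return soft == resource.RLIM_INFINITY or soft >= target


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open files: soft={describe(soft)} hard={describe(hard)}")
    if not meets_target(soft):
        print("soft limit is below 999999; consider raising it in /etc/security/limits.conf")
```

The soft limit is the one the running process is actually held to, so that is the value compared against the target.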
03-09-2022
11:48 AM
1 Kudo
@Onkar_Gagre Is the $.name field unique for every record, or do batches of records share the same $.name value? If they are not unique, did you consider using the ConsumeKafkaRecord processor feeding a PartitionRecord processor to split your records by common name values? This would still allow you to work with batches of records rather than an individual record per FlowFile. It might also be helpful if you shared the details of your end-to-end use case, as that may give folks the ability to offer even more dataflow design options. Thanks, Matt
03-09-2022
11:30 AM
1 Kudo
@sachin_32 You can accomplish this by utilizing the "Advanced UI" capability found in the UpdateAttribute processor. The Advanced UI allows you to create Rules (think of these as an IF/THEN capability). So you would set up 3 rules:
1. If the current date falls on Mon - Fri, do X
2. If the current date falls on Sat, do nothing
3. If the current date falls on Sun, do Y

The functions used here are described in the NiFi Expression Language guide.

I created 3 rules (Day1-5, Day6, and Day7). Once you create a Rule, you need to provide a Condition (this is your boolean if statement). In this case I am using it to determine the current day of the week, with 1 = Monday and 7 = Sunday, and checking whether that day is before Sat or after Sat in the current week. If a rule's Condition (if statement) resolves to a boolean "true", then the configured "Actions" (then statement) are evaluated.

For my "Day1-5" rule, I set:
Condition: ${now():format('u'):lt(6)}
Action: ${now():toNumber():minus(${now():format('u'):plus(1):multiply(86400000)}):toDate():format("EEE, dd MMM yyyy")}

For my "Day6" rule, I set:
Condition: ${now():format('u'):equals(6)}
Action: ${now():format('EEE, dd MMM yyyy')}

For my "Day7" rule, I set:
Condition: ${now():format('u'):gt(6)}
Action: ${now():toNumber():minus(86400000):toDate():format("EEE, dd MMM yyyy")}

About the above:
- The "now()" function returns the current date.
- 86400000 is the number of milliseconds in 1 day.
- So first I get the current date and convert it to milliseconds using the "toNumber()" function.
- Then for Day1-5, I subtract a multiple of one day's worth of milliseconds based on the current day of the week (day of week plus 1).
- For Day6, I do nothing other than reformat the current day's date.
- For Day7, I just subtract one day, or 86400000 milliseconds.

No matter which rule is applied, the final date I chose to write to an attribute named "PreviousSaturday" on the FlowFile is formatted using the Java SimpleDateFormat pattern "EEE, dd MMM yyyy". Example: "Sat, 05 Mar 2022"

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
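To sanity-check the day arithmetic outside of NiFi, the three rules can be mirrored in a few lines of Python. This is only a sketch of the same logic, not NiFi code: the function names are hypothetical, `isoweekday()` uses the same 1 = Monday through 7 = Sunday numbering as the 'u' format above, and the English day and month names assume the default C/English locale.

```python
from datetime import date, timedelta


def previous_saturday(today):
    """Mirror the Day1-5 / Day6 / Day7 rules for a given date."""
    dow = today.isoweekday()  # 1 = Monday ... 7 = Sunday
    if dow < 6:
        days_back = dow + 1   # Day1-5 rule: subtract (day of week + 1) days
    elif dow == 6:
        days_back = 0         # Day6 rule: today is already Saturday
    else:
        days_back = 1         # Day7 rule: Sunday, subtract one day
    return today - timedelta(days=days_back)


def format_like_rule(d):
    """Approximate the 'EEE, dd MMM yyyy' pattern with strftime."""
    return d.strftime("%a, %d %b %Y")


if __name__ == "__main__":
    # 2022-03-09 was a Wednesday; the preceding Saturday was 2022-03-05.
    print(format_like_rule(previous_saturday(date(2022, 3, 9))))
```

Running it for every day of one week is a quick way to confirm that each rule lands on the same Saturday.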
03-09-2022
09:18 AM
2 Kudos
@Harsh__Tanwar I am not clear on the exact failure you are trying to report on. If the processor produces a Bulletin when the failure to read from Event Hub occurs, you could set up the SiteToSiteBulletinReportingTask and have it send bulletins (note that it will capture all bulletins being produced by your NiFi) to a remote input port on your NiFi. There you can programmatically extract what you need from the bulletin(s) and send an alert via, for example, a PutEmail processor, or send those bulletins to some external monitoring service to handle. If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
03-07-2022
09:06 AM
2 Kudos
@richG Your observations are correct. The conversion script code does convert those properties. https://github.com/apache/nifi/blob/main/minifi/minifi-toolkit/minifi-toolkit-configuration/src/main/java/org/apache/nifi/minifi/toolkit/configuration/dto/RemotePortSchemaFunction.java I created the following Apache NiFi Jira: https://issues.apache.org/jira/browse/NIFI-9772 If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
01-31-2022
02:29 PM
@rafy You'll want to read up on the documentation on the Apache MiNiFi page here: https://nifi.apache.org/minifi/index.html Since MiNiFi does not provide a UI from which you can construct a NiFi dataflow, you will need to build the dataflow that you will use on your MiNiFi using a NiFi installation. The converter toolkit is what you can then use to change your NiFi dataflow template into the necessary MiNiFi yaml file. You may also find these community posts helpful: https://community.cloudera.com/t5/Support-Questions/How-send-data-from-nifi-to-minifi-same-config/td-p/325183 https://community.cloudera.com/t5/Community-Articles/Ingesting-Log-data-using-MiNiFi-NiFi/ta-p/248154 If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
01-31-2022
02:16 PM
@sachin_32 I would recommend taking the second approach and making use of the UpdateAttribute Advanced UI, since you will need multiple rules. You'll need a rule that handles when the day of week is less than 6 and another rule for when the day of week is greater than 6. Thanks, Matt
01-31-2022
02:05 PM
@vk21 This question is not related to the original question in this post. I recommend starting a new question to avoid the confusion of holding a new conversation in a post that already has an accepted solution. Thanks, Matt
01-26-2022
06:49 AM
2 Kudos
@rafy There are two commonly used methods for getting logs that are actively being written to from a source server:
1. Install a MiNiFi agent on the server that utilizes a TailFile processor configured to read the log file being produced and then sends those FlowFiles to your NiFi cluster for further processing. However, as you said, you cannot install new services/software on this server, so that rules out this option.
2. Modify the logger on your source system so that, in addition to logging locally, it also sends log output to an external syslog server. In this case that syslog server would be your NiFi cluster with a dataflow that uses the ListenSyslog NiFi processor.

There is no way for NiFi to connect to a remote server and incrementally pull new lines from a file that is continuously being written to. Now if your source server rolls the logs, it is possible you could have your NiFi use the ListSFTP and FetchSFTP processors to consume those rolled logs. The downside here is that this would not be real-time processing of those logs, since you are only consuming based on the log rotation configuration on that server. And you cannot use these processors to consume the log that is actively being written to. Doing so means that NiFi would pull the entire contents of the log each time the processor executes rather than just the newest log lines.

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
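As a rough illustration of option 2, here is a sketch of a logger that writes locally and also forwards to a syslog listener. It assumes the source application happens to be written in Python, which may not match your system; the logger name, file path, host, and port are all hypothetical, and it assumes the ListenSyslog processor is listening for UDP on that port.

```python
import logging
import logging.handlers


def build_logger(local_path, syslog_host, syslog_port):
    """Create a logger that writes to a local file and forwards to syslog over UDP."""
    logger = logging.getLogger("source_app")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False
    # Keep logging locally, as the application already does.
    logger.addHandler(logging.FileHandler(local_path))
    # Also send each record to the remote syslog listener (UDP is the default).
    logger.addHandler(
        logging.handlers.SysLogHandler(address=(syslog_host, syslog_port))
    )
    return logger


# Example usage (host and port are placeholders for your NiFi listener):
#   log = build_logger("/var/log/app/app.log", "nifi.example.com", 5140)
#   log.info("order 42 shipped")
```

Most logging frameworks offer an equivalent syslog appender or handler, so the same pattern applies even if the source application is not Python.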
01-25-2022
01:28 PM
@RonMilne I recommend taking your initial CSV record file and partitioning it by "salesRepId" into multiple new JSON records. This can be accomplished using the PartitionRecord processor with a CSVReader and a JsonRecordSetWriter.

Your PartitionRecord processor configuration would look like this:
(screenshot: PartitionRecord configuration)

Your CSVReader would be configured something like this (you'll need to modify it for your specific record's schema):
(screenshot: CSVReader configuration)
Note: The pop-out shows the "Schema Text" property. Don't forget to set the "Treat First Line as Header" property to "true".

The JsonRecordSetWriter would need to be configured to produce the desired JSON record output format. However, just leaving the default configuration will output a separate FlowFile for each unique "salesRepId".

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
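To picture what the partitioning produces, here is a small Python sketch of the same idea outside of NiFi: group CSV rows by one column and emit one JSON array per unique value. It is not how PartitionRecord is implemented; the function name and the sample columns other than "salesRepId" are made up for illustration.

```python
import csv
import io
import json
from collections import defaultdict


def partition_by_field(csv_text, field):
    """Group CSV rows by the given column and return one JSON array per value."""
    groups = defaultdict(list)
    # DictReader treats the first line as the header, like the reader setting above.
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[field]].append(row)
    return {key: json.dumps(rows) for key, rows in groups.items()}


if __name__ == "__main__":
    # Hypothetical sample data; only the salesRepId column comes from the question.
    sample = (
        "salesRepId,customer,amount\n"
        "101,Acme,250\n"
        "102,Globex,90\n"
        "101,Initech,40\n"
    )
    for rep_id, payload in partition_by_field(sample, "salesRepId").items():
        print(rep_id, payload)
```

Each dictionary entry here corresponds to one output FlowFile from PartitionRecord: all records sharing a "salesRepId" end up together.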