Member since: 07-30-2019 | Posts: 3390 | Kudos Received: 1618 | Solutions: 999
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 232 | 11-05-2025 11:01 AM |
| | 465 | 10-20-2025 06:29 AM |
| | 605 | 10-10-2025 08:03 AM |
| | 396 | 10-08-2025 10:52 AM |
| | 443 | 10-08-2025 10:36 AM |
07-02-2024
11:54 PM
1 Kudo
Yes, almost the same behavior is observed with the "penalize" retry strategy; the penalty duration simply gets added to the total time. For example, with the default penalty duration of 30 secs, 10 incoming FlowFiles, and 1 retry: the 10 FlowFiles are batched together, the first retry happens at 50 secs, the batched FlowFiles are then penalized for 30 secs, and after another 50 secs they go to the failure relationship. So in total, the time taken by the PublishKafka processor to route a file to the failure relationship with the penalize retry policy is (numberOfRetries + 1) * 5 secs * numberOfIncomingFlowFiles + penalty duration. If retry is not checked, then behavior similar to yield is observed: 5 * numberOfIncomingFlowFiles secs to route to the failure relationship, as shown in the photos. Penalty and yield settings are at their defaults. The target Kafka version is 3.4.0 and the number of partitions is 1. The number of NiFi nodes is 3. The Number of Concurrent Tasks on PublishKafkaRecord is 1, but execution is on all nodes, which I think means 1 thread on each of the 3 nodes.
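The timing described above can be sketched as a quick back-of-the-envelope calculation (a rough model only, assuming the observed 5 seconds per FlowFile per attempt and the default 30-second penalty; this is not official NiFi behavior, just the arithmetic from this thread):

```python
def time_to_failure(num_flowfiles: int, num_retries: int,
                    penalty_secs: float = 30.0, yield_secs: float = 5.0,
                    penalize: bool = True) -> float:
    """Rough model of how long PublishKafka takes to route a batch of
    failing FlowFiles to the failure relationship."""
    base = (num_retries + 1) * yield_secs * num_flowfiles
    return base + penalty_secs if penalize else base

# 10 FlowFiles, 1 retry, penalize strategy: 2 * 5 * 10 + 30 = 130 seconds
print(time_to_failure(10, 1))                   # 130.0
# Retry unchecked (yield-like behavior): 1 * 5 * 10 = 50 seconds
print(time_to_failure(10, 0, penalize=False))   # 50.0
```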
07-02-2024
01:00 PM
1 Kudo
@enam There is a slight mistake in the NiFi Expression Language (NEL) statement in my post above. It should be as follows instead:

Property = filename
Value = ${filename:substringBeforeLast('.')}-${UUID()}.${filename:substringAfterLast('.')}

Thanks, Matt
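For reference, the same transformation can be expressed in plain Python (a sketch only; `uuid.uuid4()` stands in for NiFi's `${UUID()}`, and it assumes the filename contains an extension):

```python
import uuid

def unique_filename(filename: str) -> str:
    """Mimic ${filename:substringBeforeLast('.')}-${UUID()}.${filename:substringAfterLast('.')}"""
    base, _, ext = filename.rpartition('.')
    return f"{base}-{uuid.uuid4()}.{ext}"

print(unique_filename("report.csv"))  # e.g. report-<random-uuid>.csv
```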
07-02-2024
07:33 AM
@Vikas-Nifi The following error is directly related to a failure to establish certificate trust in the TLS exchange between NiFi's putSlack processor and your Slack server:

javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

The putSlack processor utilizes the StandardRestrictedSSLContextService to define the keystore and truststore files the processor will use. The truststore must contain the complete trust chain for the target Slack server's serverAuth certificate. You can use:

openssl s_client -connect <companyName.slack.com>:443 -showcerts

to get an output of all public certs included with the serverAuth cert. I noticed with my Slack endpoint that this was not the complete trust chain (the root CA certificate for ISRG Root X1 was missing from the chain). You can download the missing root CA public cert directly from Let's Encrypt and add it to the truststore set in the StandardRestrictedSSLContextService:

https://letsencrypt.org/certificates/
https://letsencrypt.org/certs/isrgrootx1.pem
https://letsencrypt.org/certs/isrg-root-x2.pem

You might also want to make sure all intermediate CAs are added, and not just the intermediate returned by the openssl command, in case the server you get directed to changes.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
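As an illustration of inspecting that openssl output, here is a rough Python sketch that splits a `-showcerts`-style dump into individual PEM blocks so each can be examined or imported into the truststore (the embedded text below is a stand-in, not a real certificate):

```python
import re

def split_pem_blocks(text: str) -> list[str]:
    """Extract each BEGIN/END CERTIFICATE block from openssl -showcerts output."""
    pattern = re.compile(
        r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL)
    return pattern.findall(text)

# Dummy stand-in for real `openssl s_client -showcerts` output
showcerts_output = """depth=2 ...
-----BEGIN CERTIFICATE-----
MIIB...serverAuth-cert-placeholder...
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIB...intermediate-cert-placeholder...
-----END CERTIFICATE-----
"""
blocks = split_pem_blocks(showcerts_output)
# If the root CA is not among these blocks, fetch it separately and add it
print(len(blocks))  # 2
```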
07-02-2024
06:59 AM
@greenflag Not knowing anything about this REST API endpoint, all I have are questions:

- How would you complete this task outside of NiFi?
- How would you accomplish this using curl from the command line?
- What do the REST API docs for your endpoint say about how to get files?
- Do they expect you to pass the filename in the request?
- What is the endpoint that would return the list of files?

My initial thought here (making numerous assumptions about your endpoint) is that you would possibly need multiple InvokeHTTP processors. The first InvokeHTTP in the dataflow hits the endpoint that outputs the list of files in the endpoint directory, which would end up in the content of the FlowFile. Then you split that FlowFile by its content so you have multiple FlowFiles (one per listed file). Then you rename each FlowFile using the unique filename and finally pass each to another InvokeHTTP processor that actually fetches that specific file.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
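The list-split-fetch pattern described above can be sketched outside of NiFi. This is a hypothetical illustration (the listing format and URL paths are assumptions, not a real API), showing how one listing response fans out into one fetch URL per file:

```python
import json

# Hypothetical response from the first InvokeHTTP (the "list files" endpoint)
listing_json = '{"files": ["a.csv", "b.csv", "c.csv"]}'

def fetch_urls(listing: str, base_url: str) -> list[str]:
    """Split the listing into one fetch URL per file, mirroring the
    split / rename / fetch steps in the NiFi flow."""
    names = json.loads(listing)["files"]
    return [f"{base_url}/files/{name}" for name in names]

urls = fetch_urls(listing_json, "https://example.com/api")
print(urls[0])  # https://example.com/api/files/a.csv
```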
07-01-2024
03:05 PM
1 Kudo
@NeheikeQ Yes, a newer version of 1.x NiFi-Registry will support older versions of NiFi version controlling to it. For NiFi, after the upgrade, load the flow.xml.gz on one node and start it. Then start the other nodes so that they all inherit the flow from the one node where you had a flow.xml.gz. At that point all nodes should join successfully and will have the same dataflow loaded. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
07-01-2024
02:55 PM
1 Kudo
@Dave0x1 Typically the MergeContent processor will utilize a lot of heap when the number of FlowFiles being merged in a single execution is very high and/or the FlowFiles' attributes are very large. While FlowFiles queued in a connection have their attributes/metadata held in NiFi heap, there is a swap threshold at which NiFi swaps FlowFile attributes to disk. When it comes to MergeContent, FlowFiles are allocated to bins (they will still show in the inbound connection count). FlowFiles allocated to bin(s) cannot be swapped, so if you set min/max number of FlowFiles or min/max size to a large value, it will result in large amounts of heap usage. Note: FlowFile content is not held in heap by MergeContent. So the way to create very large merged files while keeping heap usage lower is to chain multiple MergeContent processors together in series: merge a batch of FlowFiles in the first MergeContent, then merge those into a larger merged FlowFile in a second MergeContent. Also be mindful of extracting content to FlowFile attributes or generating FlowFile attributes with large values, to help minimize heap usage. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
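The chained-MergeContent idea can be illustrated with a two-stage batch merge. This is only a sketch of the principle: strings stand in for FlowFile content, and the point is that each pass bins only a small batch at a time while the final result is identical to one big merge:

```python
def merge(batch: list[str]) -> str:
    """Stand-in for one MergeContent execution: concatenate a bin of content."""
    return "".join(batch)

def two_stage_merge(flowfiles: list[str], first_bin_size: int) -> str:
    """Stage 1: merge small batches; stage 2: merge the intermediate results.
    Each stage bins only a few FlowFiles at once, keeping per-pass usage low."""
    stage1 = [merge(flowfiles[i:i + first_bin_size])
              for i in range(0, len(flowfiles), first_bin_size)]
    return merge(stage1)

parts = [f"rec{i};" for i in range(10)]
# Same final merged content as a single large merge, but smaller bins per pass
print(two_stage_merge(parts, 3) == merge(parts))  # True
```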
06-18-2024
01:47 PM
That's a good idea; however, low latency is a user requirement. Currently, processing each file from source to destination takes around one minute. If I add a two-minute delay, the users would not be happy.
06-18-2024
01:24 PM
@omeraran If your source is continuously being written to, you might consider using the GenerateTableFetch processor --> ExecuteSQLRecord processor (configured to use JsonRecordSetWriter) --> PutDatabaseRecord processor. Working with multi-record FlowFiles by utilizing the record-based processors is going to be a more efficient and performant dataflow. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
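To illustrate the multi-record idea: a record-based FlowFile carries many rows in one payload (similar in spirit to what a JsonRecordSetWriter emits), rather than one FlowFile per row. The row values below are made up for the sketch:

```python
import json

# Hypothetical rows returned by one ExecuteSQLRecord fetch
rows = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
    {"id": 3, "name": "gamma"},
]

# One multi-record FlowFile payload instead of three single-row FlowFiles;
# downstream processors then handle all records in a single pass.
payload = json.dumps(rows)
print(len(json.loads(payload)))  # 3
```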
06-14-2024
07:46 AM
1 Kudo
@Alexy Without seeing your logs, I have no idea which NiFi classes are producing the majority of your logging, but logback is functioning exactly as you have it configured: each time nifi-app.log reaches 500 MB within a single day, it is compressed and rolled using an incrementing number. I would suggest changing the log level for the base class "org.apache.nifi" from INFO to WARN. The bulk of all NiFi classes begin with org.apache.nifi, and by changing this to WARN you will only see ERROR and WARN level log output from the bulk of the org.apache.nifi.<XYZ...> classes:

<logger name="org.apache.nifi" level="WARN"/>

Unless you have a lot of exceptions happening within the NiFi processor components used in your dataflow(s), this should have a significant impact on the amount of nifi-app.log logging being produced. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt