About MattWho

MattWho · ‎07-15-2024

@Ali_12012 The documentation for InvokeHTTP states that only the POST, PUT and PATCH HTTP methods will be sent with a body. The processor does not support sending a body with the GET HTTP method; with GET it only supports headers. You may need to build a custom processor for your use case, or perhaps use one of the scripting processors. Thank you, Matt
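As a rough illustration of what a custom or scripted alternative would have to do, the sketch below builds a GET request that carries a body using only Python's standard library. The URL and payload are hypothetical, and this is not how InvokeHTTP works internally; it only shows that the method has to be forced explicitly, and the target server must still be willing to read a body on GET.

```python
import json
import urllib.request


def build_get_with_body(url, payload):
    """Build (but do not send) a GET request that carries a JSON body."""
    data = json.dumps(payload).encode("utf-8")
    # Supplying data normally makes urllib default to POST,
    # so the method is set to GET explicitly.
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="GET",
    )


# Hypothetical endpoint and payload, for illustration only.
request = build_get_with_body("https://api.example.com/search", {"query": "nifi"})
print(request.get_method(), request.data)
```

Sending it would be a call to urllib.request.urlopen(request), which is left out here so the sketch runs without a network.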

MattWho · ‎07-11-2024

@kellerj CFM has several Service Pack versions released for 2.1.5, as well as newer CFM 2.1.6 and CFM 2.1.7 versions. If you open the cluster UI (NiFi UI --> global menu in the upper right corner) and then click on the "View Details" icon to the far left of the node that is disconnecting, what Node Events are being reported? Matt

MattWho · ‎07-02-2024

@enam I had a slight mistake in the NiFi Expression Language (NEL) statement in my above post. It should be as follows instead:
Property = filename
Value = ${filename:substringBeforeLast('.')}-${UUID()}.${filename:substringAfterLast('.')}
Thanks, Matt
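To make the effect of that expression easier to see, here is a small Python sketch that mimics it outside of NiFi. The helper names are made up for illustration, and the behavior when the filename contains no '.' (returning the whole string) is my assumption about how the NEL functions behave; check the Expression Language Guide before relying on it.

```python
import uuid


def substring_before_last(subject, delimiter):
    """Text before the last delimiter; the whole subject if it is absent (assumed)."""
    head, sep, _ = subject.rpartition(delimiter)
    return head if sep else subject


def substring_after_last(subject, delimiter):
    """Text after the last delimiter; the whole subject if it is absent (assumed)."""
    _, sep, tail = subject.rpartition(delimiter)
    return tail if sep else subject


def unique_filename(filename, unique_id=None):
    """Insert a unique id between the base name and the extension."""
    unique_id = unique_id or str(uuid.uuid4())
    base = substring_before_last(filename, ".")
    extension = substring_after_last(filename, ".")
    return f"{base}-{unique_id}.{extension}"


# A fixed id is passed here only to keep the example output predictable.
print(unique_filename("report.2024.csv", "abc123"))
```

Only the last '.' is treated as the extension separator, so a name like report.2024.csv keeps "report.2024" as its base.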

MattWho · ‎07-02-2024

@Vikas-Nifi The following error is directly related to a failure to establish certificate trust in the TLS exchange between NiFi's PutSlack processor and your Slack server:

javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

The PutSlack processor utilizes the StandardRestrictedSSLContextService to define the keystore and truststore files it will use. The truststore must contain the complete trust chain for the target Slack server's serverAuth certificate. You can use:

openssl s_client -connect <companyName.slack.com>:443 -showcerts

to get an output of all public certs included with the serverAuth cert. I noticed with my Slack endpoint that this was not the complete trust chain (the root CA certificate for ISRG Root X1 was missing from the chain). You can download the missing root CA public cert directly from Let's Encrypt and add it to the truststore set in the StandardRestrictedSSLContextService.

https://letsencrypt.org/certificates/
https://letsencrypt.org/certs/isrgrootx1.pem
https://letsencrypt.org/certs/isrg-root-x2.pem

You might also want to make sure all intermediate CAs are added, and not just the intermediate returned by the openssl command, in case the server you get directed to changes.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
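If it helps, the output of the openssl command can be split into individual certificates before importing them into the truststore. The following is only a minimal sketch: the function name is made up, and the sample text uses placeholder strings rather than real certificate data.

```python
import re

# Matches each PEM certificate block, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def extract_pem_certificates(showcerts_output):
    """Return every PEM certificate block found in the given text."""
    return [m.group(0) + "\n" for m in PEM_BLOCK.finditer(showcerts_output)]


# Placeholder text shaped like "openssl s_client -showcerts" output.
sample_output = """Certificate chain
 0 s:CN = example-server
-----BEGIN CERTIFICATE-----
PLACEHOLDERSERVERCERT
-----END CERTIFICATE-----
 1 s:CN = example-intermediate
-----BEGIN CERTIFICATE-----
PLACEHOLDERINTERMEDIATECERT
-----END CERTIFICATE-----
"""

certificates = extract_pem_certificates(sample_output)
print(len(certificates), "certificate(s) found")
```

Each returned block could then be written to its own .pem file. Remember that a root CA missing from the server's output, as in my case, still has to be downloaded separately.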

MattWho · ‎07-02-2024

@greenflag Not knowing anything about this rest-api endpoint, all I have are questions. How would you complete this task outside of NiFi? How would you accomplish this using curl from the command line? What do the REST-API docs for your endpoint say about how to get files? Do they expect you to pass the filename in the rest-api request? What is the rest-api endpoint that would return the list of files?

My initial thought here (making numerous assumptions about your endpoint) is that you would possibly need multiple InvokeHTTP processors. The first InvokeHTTP in the dataflow hits the rest-api endpoint that outputs the list of files in the endpoint directory, which would end up in the content of the FlowFile. Then you split that FlowFile by its content so you have multiple FlowFiles (one per listed file). Then rename each FlowFile using the unique filename, and finally pass each to another InvokeHTTP processor that actually fetches that specific file.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
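The list, split, fetch pattern described above can be sketched outside of NiFi as follows. Everything about the endpoint is assumed: the URLs are hypothetical, the listing is assumed to be one filename per line, and the HTTP call is passed in as a function so the sketch can run against a fake endpoint.

```python
def fetch_all_files(http_get, list_url, file_url_template):
    """List the remote files, then fetch each one by name.

    http_get is any callable that takes a URL and returns the response bytes.
    """
    # Step 1: one request returns the listing (first InvokeHTTP).
    listing = http_get(list_url).decode("utf-8")
    # Step 2: split the listing into one entry per file (the split step).
    filenames = [line.strip() for line in listing.splitlines() if line.strip()]
    # Step 3: one request per filename (second InvokeHTTP).
    return {
        name: http_get(file_url_template.format(filename=name))
        for name in filenames
    }


# Fake endpoint standing in for the real REST API (hypothetical URLs).
FAKE_RESPONSES = {
    "https://api.example.com/files": b"a.csv\nb.csv\n",
    "https://api.example.com/files/a.csv": b"id,name\n1,alpha\n",
    "https://api.example.com/files/b.csv": b"id,name\n2,beta\n",
}

files = fetch_all_files(
    FAKE_RESPONSES.__getitem__,
    "https://api.example.com/files",
    "https://api.example.com/files/{filename}",
)
print(sorted(files))
```

If the real API returns JSON or needs the filename as a query parameter instead, only the parsing in step 2 and the URL template would change.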

MattWho · ‎07-02-2024

@NIFI-USER Are you seeing the same behavior even when not using the retry strategy of "yield"? What about when retry is not checked? FlowFiles, upon failure, should immediately be transferred to the connection containing the failure relationship. What are the penalty and yield settings on your PublishKafkaRecord_1_0? What version is your target Kafka (you are using a rather old Kafka client version 1.0)? As far as your Kafka topic goes, how many partitions are on the topic? How many concurrent tasks are set on PublishKafkaRecord? How many nodes are in your NiFi cluster? Thanks, Matt

MattWho · ‎07-02-2024

@Heiko Thanks for sharing. The choice between "USE_USERNAME" and "USE_DN" needs to be evaluated against the specific structure of the end user's LDAP/AD. With AD, the user commonly logs in with their sAMAccountName, and very often the sAMAccountName value is not the same string used within the user's DN. While you would still be able to log in using your sAMAccountName and password, the user identity passed to the authorizer would be the CN value from that full DN (your regex assumes the CN consists of only upper or lower case letters and numbers, which may not work for all DNs).

Then with the switch to using the CN from the DN, you need to consider equivalent changes in the ldap-user-group-provider in authorizers.xml. You'll need to make sure whatever user identity strings come out of authentication through the DN are properly mapped to group identities.

Both solutions will work, and both solutions need careful evaluation to set up. I typically find using USE_USERNAME more consistent in structure (LDAP and AD), and thus less impacted by the corner case oddities that using USE_DN can introduce. Thanks again for your contributions to the community. There is often more than one way to solve most queries in Apache NiFi. Matt
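To illustrate the regex concern, here is a small sketch using a pattern of the kind described (letters and numbers only). The pattern and the sample DNs are my own stand-ins, not the exact regex from the earlier post or anything from a real directory.

```python
import re

# Stand-in pattern: captures a CN made up only of letters and digits.
CN_PATTERN = re.compile(r"^CN=([A-Za-z0-9]+),.*$")


def cn_from_dn(dn):
    """Return the captured CN, or None when the pattern does not match."""
    match = CN_PATTERN.match(dn)
    return match.group(1) if match else None


sample_dns = [
    "CN=jsmith,OU=Users,DC=example,DC=com",      # matches
    "CN=John Smith,OU=Users,DC=example,DC=com",  # space in CN: no match
    "CN=j.smith,OU=Users,DC=example,DC=com",     # dot in CN: no match
]

for dn in sample_dns:
    print(dn, "->", cn_from_dn(dn))
```

Any DN whose CN contains spaces, dots, or hyphens falls through such a pattern, so the resulting user identity would not be the short CN string that the group mappings expect.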

MattWho · ‎07-01-2024

@NeheikeQ Yes, a newer 1.x version of NiFi-Registry will support older versions of NiFi version controlling to it. For NiFi, after the upgrade, load the flow.xml.gz on one node and start it. Then start the other nodes so that they all inherit the flow from the one node where you had a flow.xml.gz. At this point all nodes should join successfully and will have the same dataflow loaded. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎07-01-2024

@Dave0x1 Typically the MergeContent processor will utilize a lot of heap when the number of FlowFiles being merged in a single execution is very high and/or the FlowFiles' attributes are very large. While FlowFiles queued in a connection have their FlowFile attributes/metadata held in NiFi heap, there is a swap threshold at which NiFi swaps FlowFile attributes to disk. When it comes to MergeContent, FlowFiles are allocated to bins (they will still show in the inbound connection count). FlowFiles allocated to bin(s) cannot be swapped. So if you set min/max number of FlowFiles or min/max size to a large value, it would result in large amounts of heap usage. Note: FlowFile content is not held in heap by MergeContent.

So the way to create very large merged files while keeping heap usage lower is by chaining multiple MergeContent processors together in series. You merge a batch of FlowFiles in the first MergeContent and then merge those into a larger merged FlowFile in a second MergeContent. To help minimize heap usage, also be mindful of extracting content to FlowFile attributes or generating FlowFile attributes with large values.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
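A quick back-of-the-envelope sketch of why chaining helps: the numbers below are purely illustrative, and the code only counts how many entries a single bin has to hold at once. It is a simplification, not a model of MergeContent itself (for example, several bins can be open at the same time).

```python
def merge_in_batches(items, batch_size):
    """Group items into bins of at most batch_size, like one merge stage."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def chained_merge(flowfiles, first_stage_size, second_stage_size):
    """Run two merge stages in series and report the largest bin seen."""
    stage_one = merge_in_batches(flowfiles, first_stage_size)
    # Each stage-one bin becomes a single merged FlowFile for stage two.
    stage_two = merge_in_batches(stage_one, second_stage_size)
    largest_bin = max(len(b) for b in stage_one + stage_two)
    return stage_two, largest_bin


# Illustrative numbers: 1,000,000 small FlowFiles merged into one file.
flowfiles = list(range(1_000_000))
merged, largest_bin = chained_merge(flowfiles, 1000, 1000)
print(len(merged), "merged output(s); largest bin held", largest_bin, "entries")
```

A single MergeContent producing the same result would need one bin holding all 1,000,000 FlowFiles, whereas in the chained version no bin ever holds more than 1,000.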

MattWho · ‎07-01-2024

@Trilok The older flow.xml.gz format was deprecated as of Apache NiFi 1.16 in favor of the newer flow.json.gz format. NiFi 1.16+ will only load the flow.xml.gz if the flow.json.gz does not already exist during startup. Upon successful startup, NiFi will generate the flow.json.gz. NiFi 1.16+ will still generate both the flow.xml.gz and flow.json.gz formats with every change made in the UI. With the major release of Apache NiFi 2.x, the deprecated flow.xml.gz format was removed. There is no option in NiFi 2.0 to support the older flow.xml.gz format. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

Member Since	‎07-30-2019 10:41 AM
Last Visited	‎05-18-2026 11:55 PM
Posts	3,470
Kudos received	1637

Cloudera Community

Re: How to invoke a url in nifi which is protected...

Re: Retry impacts scheduler

Re: 503 error while copying/versioning big process...

Re: FetchSMB not fetching all files

Re: Nifi: How to revoke the import and export Temp...

Re: Nifi 2.0.0 \ Invoke HTTP process (Get) with bo...

Re: Failed to connect node to cluster because loca...

Re: how ot change file name moving another locati...

Re: NiFi Slack Integration issue - 1.26.0

Re: Apache Nifi: How to get all data csv in folder...

Re: Apache Nifi PublishKafka Retry Mechanism in ca...

Re: nifi login case sensitivity

Re: NiFi node disconnection from Cluster + Diff in...

Re: NiFi high jvm heap utilization on primary node

Re: Nifi 2.0-M2 Support with flow xml gz