Member since: 07-30-2019
Posts: 3406
Kudos Received: 1623
Solutions: 1008

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 336 | 12-17-2025 05:55 AM |
| | 397 | 12-15-2025 01:29 PM |
| | 405 | 12-15-2025 06:50 AM |
| | 370 | 12-05-2025 08:25 AM |
| | 604 | 12-03-2025 10:21 AM |
06-21-2023
01:46 PM
@jisaitua Definitely some weird, unexpected behavior there. I was able to reproduce your issue, as well as some other unexpected behavior in the same area. As a workaround for your specific issue, the following NiFi Expression Language (NEL) statement will get you what you are looking for: {${literal('')}"$${key}":"${literal('value')}"}

I also filed an Apache NiFi Jira for this issue: https://issues.apache.org/jira/browse/NIFI-11738

I reproduced it on an Apache NiFi 1.18 cluster I have running, so it is not an issue that only recently appeared.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click "Accept as Solution" below each response that helped. Thank you, Matt
06-16-2023
09:58 AM
@Kiranq
1. What version of NiFi and NiFi-Registry are you using?
2. How is your GitFlowPersistenceProvider configured?
Matt
06-16-2023
09:43 AM
1 Kudo
@hule Welcome to the community. I am not completely clear on your use case from your description. What are you trying to accomplish? Are you trying to generate emails within NiFi and send them to an external Outlook email address, or are you trying to have your NiFi receive emails?

It sounds like you may be sending an email from a NiFi dataflow to some email address. If so, the processor you want to use is the PutEmail processor. As for the SMTP settings, those need to come from the target SMTP server. For example:
Microsoft provides SMTP settings here: https://support.microsoft.com/en-gb/office/pop-imap-and-smtp-settings-8361e398-8af4-4e97-b147-6c6c4ac95353
Google Gmail provides SMTP settings here: https://support.google.com/a/answer/176600?hl=en

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click "Accept as Solution" below each response that helped. Thank you, Matt
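As a side note, you can sanity-check SMTP settings outside of NiFi before entering them into PutEmail. Here is a minimal Python sketch; the host and port come from the Microsoft page linked above, while the account, password, and recipient are hypothetical placeholders you would replace with your own:

```python
import smtplib
from email.message import EmailMessage

# Build a simple test message (addresses are placeholders).
msg = EmailMessage()
msg["Subject"] = "NiFi SMTP test"
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg.set_content("Test message sent with the same SMTP settings PutEmail would use.")

# smtp.office365.com:587 with STARTTLS, per Microsoft's published settings.
with smtplib.SMTP("smtp.office365.com", 587) as server:
    server.starttls()                              # upgrade to TLS before authenticating
    server.login("sender@example.com", "app-password")  # hypothetical credentials
    server.send_message(msg)
```

If this script succeeds, the same hostname, port, and credentials should work in the PutEmail processor.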
06-16-2023
06:25 AM
@MOUROU Is your NiFi configured to support OAuth2-based user authentication? It looks more like you are using either the kerberos-provider or the ldap-provider for user authentication. My suggestion to create a client certificate and use an SSLContext service for client authentication in an automated dataflow like this is because:
1. There is no need to ever obtain a token.
2. Certificates can be created with long expiration times.
3. Tokens are NiFi node specific (the same token is not valid for a different NiFi node in the same NiFi cluster).
4. The same certificate works no matter which node in the cluster the invokeHTTP connects with.
Matt
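For contrast, here is a minimal Python sketch of the token flow that certificate authentication avoids. It assumes NiFi is configured with a login identity provider (e.g. ldap-provider); the hostname, credentials, and CA file are hypothetical:

```python
import requests

# Hypothetical NiFi node; tokens are only honored by the node that issued them.
NODE1 = "https://nifi-node1.example.com:8443"

# Step you must repeat whenever the token expires (point 1 above).
token = requests.post(
    f"{NODE1}/nifi-api/access/token",
    data={"username": "svc-user", "password": "secret"},  # placeholder credentials
    verify="nifi-ca.pem",  # CA that signed the NiFi server certificate
).text

# Every subsequent request must go back to NODE1, since the JWT is
# node specific (point 3 above).
status = requests.get(
    f"{NODE1}/nifi-api/flow/status",
    headers={"Authorization": f"Bearer {token}"},
    verify="nifi-ca.pem",
)
print(status.status_code)
```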
06-15-2023
01:20 PM
@MOUROU I'd recommend using a clientAuth certificate for interacting with the NiFi rest-api. Certificate-based authentication via a mutual TLS exchange (always enabled in a secure NiFi) is already how NiFi nodes communicate with one another. Using a certificate does not require the extra step of obtaining a token, and a token is only valid for use with the NiFi node that issued it. Certificates can be created with long expiration times (typically valid for 1 or 2 years by default).

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click "Accept as Solution" below each response that helped. Thank you, Matt
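To illustrate, here is a minimal Python sketch of a certificate-authenticated rest-api call. The hostname and file paths are hypothetical; note that Python's requests library expects PEM files, so a JKS or PKCS12 keystore would first need to be converted:

```python
import requests

# Hypothetical NiFi node; with mutual TLS the same client cert works
# against any node in the cluster.
NIFI = "https://nifi-node1.example.com:8443"

resp = requests.get(
    f"{NIFI}/nifi-api/flow/status",
    cert=("client.pem", "client.key"),  # client certificate + private key (mutual TLS)
    verify="nifi-ca.pem",               # CA that signed the NiFi server certificate
)
resp.raise_for_status()
print(resp.json())
```

No token request is needed here; the TLS handshake itself authenticates the client, and NiFi then authorizes the certificate's DN like any other user identity.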
06-15-2023
01:16 PM
1 Kudo
@steven-matison Looking at your flow definition, I see that your invokeHTTP processors are configured to use an SSLContextService. I am assuming that SSLContextService is configured with the HTTPS-enabled NiFi keystore and truststore. When you then access the HTTPS rest-api endpoint, NiFi in the TLS exchange WANTs a client certificate, which is provided via the SSLContextService. I am then guessing your NiFi servers have been authorized to access that rest-api endpoint. You are correct that you would not need an Access Token, since authentication was handled via a mutual TLS exchange with NiFi.

Using certificates is actually the recommended method for interacting with the NiFi rest-api for a number of reasons:
- No need for the extra step of getting a token.
- A token is only valid for the specific NiFi node that issued it.
Thanks, Matt
06-15-2023
05:56 AM
1 Kudo
@noncitizen Welcome to the Community!!! Apache NiFi is a very large open source project. Over the 8+ years since it was originally open sourced, it has grown so large that the download distribution has reached the maximum allowable size and no longer includes all components the community has developed for NiFi. There are more than 400 unique components developed for NiFi, and the number grows every year. Many of these "add-on" components can be found in various open source repositories, and NiFi makes it very easy to add them to your NiFi (even hot loading is possible).

As is true with many open source products with lots of contributors, the documentation usually comes after the development and may at times be lacking in detail. Sometimes this is because the originator could not anticipate all the possible use cases for a given component; other times, being so close to the development, they carry a good amount of self-inferred knowledge and understanding.

I myself have been working with NiFi for more than 8 years and have been exposed to many use cases, bugs, improvements, etc. I look forward to seeing you more in the community as you learn and grow and begin to help others using that newfound knowledge.
06-15-2023
05:36 AM
@Kiranq This is what I believe you did, based on your description: You were originally using the FileSystemFlowPersistenceProvider and had already version controlled one or more NiFi Process Groups (PG) to your NiFi-Registry. Then you changed your configuration to use the GitFlowPersistenceProvider.

This is what I believe you did not do: Prior to switching flow persistence providers, did you stop version control on all your PGs?

Switching from another Flow Persistence Provider: In order to switch the Flow Persistence Provider, it is necessary to reset NiFi Registry. For example, to switch from the FileSystemFlowPersistenceProvider to the GitFlowPersistenceProvider, follow these steps:
1. Stop version control on all Process Groups in NiFi.
2. Stop NiFi Registry.
3. Move the H2 DB (specified as nifi.registry.db.directory in nifi-registry.properties) and the Flow Storage Directory for the FileSystemFlowPersistenceProvider somewhere for backup.
4. Configure the GitFlowPersistenceProvider in providers.xml.
5. Start NiFi Registry.
6. Recreate any buckets.
7. Start version control on all Process Groups again.

It appears as though you may not have followed the above documented steps from the NiFi-Registry admin guide (https://nifi.apache.org/docs/nifi-registry-docs/html/administration-guide.html#flow-persistence-providers). This leaves your Metadata Database holding info about your version controlled PGs, where that metadata references flows it can no longer find in the configured flow persistence provider.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click "Accept as Solution" below each response that helped. Thank you, Matt
06-14-2023
01:11 PM
1 Kudo
@noncitizen MergeContent processor: A "bin" is a virtual container to which FlowFiles are assigned during execution of the MergeContent processor. FlowFiles that are allocated to bin(s) remain in NiFi heap memory and cannot be swapped out to disk. How FlowFiles are allocated to bins from inbound connections during execution depends on the configured "merge strategy":

Bin-Packing Algorithm - Allocates FlowFiles to one bin until that bin has reached the configured mins (min num entries and min group size). If a FlowFile cannot be allocated to a bin (for example, because doing so would exceed the configured max group size), then the FlowFile will be allocated to a second bin.

Defragment - Use case specific; it depends on the source FlowFiles having specific attributes about each fragment (fragment.identifier, fragment.index, fragment.count, and segment.original.filename). A new bin is used for each unique fragment.identifier FlowFile attribute value.

For your use case description, you would be using the "Bin-Packing Algorithm" merge strategy. When MergeContent executes (0 secs means execute as often as possible), it looks at the unallocated FlowFiles in an inbound connection at that exact moment in time and allocates them to an existing bin or bins as described previously. At the end of binning the FlowFiles, it checks whether any bins are eligible to be merged. MergeContent will merge a bin when any one of the following is true (see the toy sketch below):
1. Both mins have been met for the bin (min num entries AND min group size). Min group size is ignored if blank.
2. The bin contains all fragments of a fragmented FlowFile (merge strategy = Defragment only).
3. The bin has reached the configured max bin age (max bin age forces the merge of a bin after the configured amount of time; the age starts upon the first allocated FlowFile). This prevents a bin that never reached the configured mins from sitting un-merged indefinitely.
4. All bins have FlowFiles allocated to them and the next unallocated FlowFile cannot be allocated to any of the existing bins (the oldest bin is forced to merge to free a bin to which that new FlowFile will be allocated). When merge strategy = Defragment, the oldest bin of FlowFiles is routed to the "failure" relationship instead of being force-merged to free a bin.

I suspect that by having only 1 bin, a forced merge is happening in some of your tests. In others, the min(s) are set too low and the bin becomes eligible for merge before all FlowFiles have been allocated to it. (You reported this worked once, probably because you had all 63 CSVs queued in the inbound connection before you started the MergeContent; the other times, when it failed, all components were running as data streamed through your dataflow.) The MergeContent processor has no idea how many FlowFiles should go into a bin (unless merge strategy = Defragment).

Also keep in mind that the nodes in a NiFi cluster execute dataflows independently of one another. Each node has its own copy of the flow.json.gz loaded in memory, each node has its own content and FlowFile repositories, and each node executes only on the FlowFiles present on that node. So if you have multiple nodes ingesting data that you want to merge into a single FlowFile (zip), then the use of a "single node" load balanced connection prior to the MergeContent processor is the correct approach.
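Here is a toy Python sketch of the binning rules just described. This is not NiFi's actual implementation, just a model of the Bin-Packing Algorithm behavior (names and thresholds are illustrative):

```python
import time
from dataclasses import dataclass, field

@dataclass
class Bin:
    """A virtual container of FlowFiles, as described above."""
    created: float = field(default_factory=time.monotonic)
    entries: int = 0
    size_bytes: int = 0

def bin_is_eligible(b: Bin, min_entries: int, min_size: int | None,
                    max_bin_age_secs: float | None) -> bool:
    # Rule 1: both mins met (min group size is ignored if blank/None).
    mins_met = b.entries >= min_entries and (min_size is None or b.size_bytes >= min_size)
    # Rule 3: max bin age forces a merge even if the mins were never reached.
    too_old = (max_bin_age_secs is not None
               and time.monotonic() - b.created >= max_bin_age_secs)
    return mins_met or too_old

def allocate(bins: list[Bin], ff_size: int, max_entries: int,
             max_size: int, max_bins: int) -> Bin:
    # Try to fit the FlowFile into an existing bin without exceeding the maxes.
    for b in bins:
        if b.entries < max_entries and b.size_bytes + ff_size <= max_size:
            b.entries += 1
            b.size_bytes += ff_size
            return b
    # Rule 4: no bin fits; if all bins are in use, the oldest is force-merged
    # to free a bin for this FlowFile.
    if len(bins) >= max_bins:
        oldest = min(bins, key=lambda b: b.created)
        bins.remove(oldest)  # in NiFi, this bin would be merged now
    new_bin = Bin(entries=1, size_bytes=ff_size)
    bins.append(new_bin)
    return new_bin
```

With max_bins = 1, you can see from allocate() why every execution that cannot fit a new FlowFile forces the lone bin to merge early, which matches the behavior you observed.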
So now let's look at what configuration would most likely work for you:
- Merge Strategy = Bin-Packing Algorithm
- Merge Format = zip
- Correlation Attribute = <blank>, since you are not trying to divide incoming FlowFiles into different bins
- Min number of entries = 100 (since you are trying to make sure all 63 FlowFiles make it into the bin, regardless of how many processor executions it takes to accomplish that)
- Max number of entries = 1000 (default)
- Max bin age = 2 mins (set this high enough that you feel confident all FlowFiles will reach the inbound connection before the bin is forced to merge; the default is blank, and depending on server resources this processor could execute many times per second)
- Max number of bins = 5 (default). I never recommend having only 1 bin.
All other properties are defaults.

What this does is allow 2 minutes for all 63 of your FlowFiles to be placed in one bin before the max bin age kicks in and forces that bin to merge. Of course, you can adjust this after testing. (You have source FlowFiles that are already CSV, but others need to be unpacked, which may delay them reaching MergeContent, even if only by milliseconds. Even that short delay could mean different executions of MergeContent try to bin and merge.) Also, "single node" load balancing is important if your FlowFiles are spread across all your cluster nodes, since MergeContent can only merge those on the same node.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click "Accept as Solution" below each response that helped. Thank you, Matt
06-14-2023
12:16 PM
@SandyClouds The CaptureChangeMySQL processor uses a third-party client for connecting to MySQL and reading the binlogs: https://github.com/zendesk/mysql-binlog-connector-java I don't see that this client library has the ability to read encrypted binlogs. I encourage you to open an Apache NiFi Jira for this product improvement to see if there is other interest in it within the Apache NiFi community. Such an improvement may require identifying a different client library that can support the new encrypted-binlog capability offered in MySQL 8.0+.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click "Accept as Solution" below each response that helped. Thank you, Matt