Member since: 07-30-2019
Posts: 3369
Kudos Received: 1616
Solutions: 996
05-28-2025
07:47 AM
@s198 I am happy to report that I have reviewed CountText with Cloudera's engineering team, and Cloudera will officially support the CountText processor effective with the CFM 2.1.7 SP2 and CFM 4.10 releases. The documentation has been updated to reflect this change:
https://docs.cloudera.com/cfm/2.1.7/release-notes/topics/cfm-supported-processors.html
https://docs.cloudera.com/cfm/4.10.0/release-notes/topics/cfm-supported-processors.html
Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them. Thank you, Matt
05-28-2025
06:23 AM
@sydney- The exception you shared:
"Unable to obtain listing of buckets: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target"
indicates a TLS trust issue within the mutual TLS handshake between your NiFi and NiFi-Registry. Perhaps the complete trust chain is missing from the truststores. Looking at the output from OpenSSL may help you see what is missing:
openssl s_client -connect <nifi-registry-hostname>:<nifi-registry-port> -showcerts
openssl s_client -connect <nifi-hostname>:<nifi-port> -showcerts
You need to get past this issue before dealing with any proxied user identity issues. Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them. Thank you, Matt
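As a quick sanity check on the s_client output above, you can count how many certificates the server actually presented. The helper below is a small illustrative sketch (not part of NiFi or OpenSSL); the embedded sample text stands in for real `-showcerts` output. If only one certificate appears, the server is not presenting its intermediates, and the truststore must then hold the full chain itself.

```python
import re

# Hypothetical helper: given the text output of
#   openssl s_client -connect host:port -showcerts
# count how many PEM certificates the server presented.
def count_presented_certs(s_client_output: str) -> int:
    return len(re.findall(r"-----BEGIN CERTIFICATE-----", s_client_output))

# Stand-in for real s_client output (truncated, not a valid certificate).
sample = """
-----BEGIN CERTIFICATE-----
MIIB...leaf...
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIB...intermediate...
-----END CERTIFICATE-----
"""
print(count_presented_certs(sample))  # 2 certificates presented
```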
05-28-2025
05:58 AM
@nifier Apache NiFi does not have any AS2 protocol specific processors. This may require building a new processor to handle this protocol. As far as the EDI file format goes, NiFi is built to be data agnostic, so it can handle data of any format. NiFi can do this because it generates a FlowFile that consists of FlowFile content (the physical content bytes) and FlowFile attributes (metadata about the content of the FlowFile). In this way NiFi can move a FlowFile from processor to processor with no dependency on the content format. Only processor components that need to interact with the content of a FlowFile need to understand the content's format. This is where custom processor components may need to be created, depending on what you are trying to accomplish as part of your use case. If you have no need to read or modify your EDI formatted content within your NiFi dataflow(s), then the only custom processor you would need is one that can handle the AS2 file transfer protocol. You could raise a Jira within the Apache NiFi jira project (https://issues.apache.org/jira/browse/NIFI) with details and request help from that community in developing AS2 protocol processors. If there is enough interest, someone may assist there, or you can contribute to the Apache NiFi project yourself. You'll need to create an account before you can create a new jira in that project. Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them. Thank you, Matt
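The content/attributes separation described above can be sketched in a few lines. This is a minimal illustration of the model, not NiFi's actual API; the `FlowFile` class and `route_on_attribute` function here are hypothetical names for the sake of the example.

```python
from dataclasses import dataclass, field

# Minimal sketch of the FlowFile model: opaque content bytes plus an
# attributes map. Processors that only route on attributes never need
# to understand the content format (EDI, JSON, binary, ...).
@dataclass
class FlowFile:
    content: bytes                                   # physical content bytes (e.g. raw EDI)
    attributes: dict = field(default_factory=dict)   # metadata about the content

def route_on_attribute(ff: FlowFile, key: str, value: str) -> bool:
    # The routing decision touches only attributes, never the content bytes.
    return ff.attributes.get(key) == value

edi = FlowFile(b"ISA*00*...~", {"filename": "order.edi", "mime.type": "application/edi-x12"})
print(route_on_attribute(edi, "mime.type", "application/edi-x12"))  # True
```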
05-21-2025
06:53 AM
1 Kudo
@TyTheNiFiGuy On first startup of NiFi the flow.json.gz does not yet exist when the authorizers.xml is executed, so the root process group is not yet known to the file-access-policy-provider, which therefore cannot create the initial policies for that root process group's generated ID. This is why you see all the icons greyed out above the NiFi canvas. When you constructed the authorizations.xml manually, you used a root process group ID generated during a different startup. If the flow.json.gz does not exist during the next startup, the newly created one will not have that same ID, and thus you get greyed out icons again above the canvas in the NiFi UI. So you can only use this constructed authorizations.xml if you are also using the initially generated flow.json.gz that uses that ID for the root process group. This is really not a blocker. Your initial admin user can access NiFi and does have full permissions on "/policies", which gives that user the ability to set new policies, including for itself. Simply right click within the generated canvas (root process group) and click "manage access policies"; this will allow that user to set the authorizations (view the component, modify the component, etc.) on the root process group needed to make all the icons appear. Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them. Thank you, Matt
05-21-2025
06:32 AM
@melek6199 Let me try to address each of your statements, as there appears to be some misunderstanding of how authentication and authorization work between NiFi and NiFi-Registry.

"I have a 3-node NiFi cluster, and I want to manage it using NiFi Registry. I configured both NiFi and NiFi Registry with a single certificate using the TLS Toolkit. I also set up LDAP integration. I can successfully connect to both NiFi and NiFi Registry individually using my LDAP users."

NiFi-Registry does not manage your NiFi cluster. These are two different services. NiFi-Registry is used by NiFi to version control process groups created and managed in NiFi. It is not a security best practice to use one certificate for all your servers. You should have one certificate per server. (If you have two services, a NiFi node and NiFi-Registry, on the same server, they can both use the same certificate.) In production I would recommend using certificates signed by an actual legitimate signing authority rather than TLS Toolkit generated certificates and truststores.

Your keystores must meet the following requirements:
- Contain only one PrivateKey entry.
- That PrivateKey entry supports both ClientAuth and ServerAuth ExtendedKeyUsage (EKU). (Note: NiFi-Registry does not require ClientAuth, but there is no harm in having it.)
- Contain at least one SAN entry that matches the hostname of the server on which the certificate is being used.

Your NiFi/NiFi-Registry truststore must meet the following requirement:
- Contain a TrustedCertEntry for every signer/issuer of the certificates passed in a mutual TLS handshake (the complete trust chain for every certificate that will be used to communicate between NiFi nodes and with NiFi-Registry).

You can use the NiFi TLS Toolkit to generate 4 keystores and 1 truststore for your NiFi and NiFi-Registry services, but make sure you are running with the "--subjectAlternativeNames" option. Those SANs should include the hostnames of the servers on which the services will run. (Now technically you could create one certificate with SANs for all the hosts and then use that one cert on all hosts, but as I said, that is not a security best practice.)

"However, the LDAP user that I added and authorized in the Registry does not appear in NiFi. With the certificate user, I can view the bucket in NiFi Registry from NiFi and perform flow version control. But I cannot do this with my LDAP user."

The user that authenticates into NiFi-Registry does not need to exist in NiFi; however, any user identity authenticated into NiFi must exist and have proper authorization in NiFi-Registry in order to conduct version control operations from NiFi. When your ldap user authenticates into NiFi, you will see that user's "user identity" displayed in the upper right corner (keep in mind that your user is only authenticated into the NiFi node you access the cluster from, not all the NiFi nodes). When that user attempts to start version control on a process group, NiFi connects and authenticates with NiFi-Registry via a mutual TLS exchange/handshake. Over that connection it proxies the request on behalf of that "user identity" (case sensitive). This means that not only do the NiFi node clientAuth certificates need to be authorized in NiFi-Registry with read on "Can Manage Buckets" and read, write, delete on "Can Proxy Requests", but the NiFi "user identity" also needs to be authorized on any bucket you want that "user identity" to be able to use for version control. (Let me know if you need help with how a mutual TLS handshake works.) Since your NiFi authenticated ldap "user identity" has not been added and authorized on any buckets in NiFi-Registry, nothing will appear in the list of available buckets for that "user identity" in NiFi.

"NOTE: Even if I generate separate certificates for NiFi and NiFi Registry and trust each certificate independently, the certificate user does not have permission to view the bucket. This is because the certificate user from the Registry is also not created in NiFi. For this reason, I generated both from the same certificate."

From what I shared in response to the two sections above, you can see that the certificates used by the NiFi hosts are only used to proxy requests to NiFi-Registry on behalf of the "user identity" authenticated with NiFi.

There are a few things that don't make sense to me in your shared NiFi-Registry configuration:
- NiFi-Registry identity-providers.xml: I see you set "<property name="Authentication Strategy">SIMPLE</property>", yet your ldap URL is "ldaps". This should then also be set to "LDAPS".
- NiFi-Registry authorizers.xml: You are using the file-user-group-provider, which allows you to manually define an initial set of "user identities" on first startup (no edits to this config take effect if the users.xml already exists during startup). This provider also allows additional "user identities" to be added later via the NiFi-Registry UI directly. NOTE: There is also an available ldap-user-group-provider that can be used to sync select user "user identities" and group "group identities" from ldap into your NiFi-Registry list of identities. This is helpful if you don't want to manage your ldap user and group identities manually within NiFi and NiFi-Registry.
- You are using the file-access-policy-provider, which only creates the authorizations.xml file if it does not already exist on startup. In it I can see "Initial Admin Identity">CN=nifi_amadeus_admin is set; however, in your ldap-provider you have configured "Identity Strategy">USE_USERNAME. I can only assume you did similar in your NiFi setup? It is unlikely that when you are logging into NiFi you are typing the username as "CN=nifi_amadeus_admin", since this would not be the expected value in the "sAMAccountName" ldap field/attribute. That means your initial admin "user identity" does not match the identity of your authenticated user (unless you have this set because you are using a certificate to authenticate into the services, as above).

In the end, these are the key things you need to know:
- User identities must match and are case sensitive ("Bob" and "bob" would be treated as two unique user identities). The user identity as displayed in the upper right corner of the NiFi UI must be authorized on specific bucket(s) in NiFi-Registry in order to successfully use version control in NiFi. This you do not have set up correctly yet.
- NiFi nodes also need to be properly authorized in NiFi-Registry for manage buckets and all proxy permissions. A node's user identity comes from the full DN of the NiFi node's clientAuth certificate. That full DN can be modified through the use of the identity.mapping properties in the nifi-registry.properties file. Note: In your shared nifi-registry.properties file the identity.mapping properties are commented out and not in use, so the full DN of each NiFi node would be used as that node's user identity and would need to be authorized; that corresponds with the full DN used in the file-access-policy-provider you have configured.

I know the above is a lot of information, but I wanted you to fully understand how authentication and authorization between NiFi and NiFi-Registry work. Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them. Thank you, Matt
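For reference, the identity.mapping properties mentioned above live in nifi-registry.properties. The fragment below is an illustrative sketch only: the DN pattern and mapped value are assumptions and must be adapted to the actual DNs in your certificates.

```properties
# Illustrative example only -- the pattern must match YOUR certificate DNs.
# Maps "CN=host1, OU=NIFI, O=Example" to just "host1" as the node identity.
nifi.registry.security.identity.mapping.pattern.dn=^CN=(.*?), OU=(.*?), O=(.*?)$
nifi.registry.security.identity.mapping.value.dn=$1
nifi.registry.security.identity.mapping.transform.dn=NONE
```

Remember that whatever identity results from this mapping is what must be authorized in NiFi-Registry, so the mapped value must match the authorized identities exactly (case sensitive).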
05-20-2025
12:50 PM
@AndreyDE Glad I could help. Using just ControlRate by itself has its limitations as well, because it does not optimize for performance/throughput. You are setting some rate at which batches of FlowFiles will be passed downstream. Doing just this means multiple things can happen:
1. The rate is too short, resulting in additional batches being passed downstream before the previous batch has completed processing. This could potentially lead to large backlogs affecting downstream processing, just as you were experiencing previously.
2. The rate is too long, resulting in downstream processing of a batch completing well before ControlRate releases the next batch. This results in slower overall throughput.
3. If some exception occurs in downstream processing, nothing would prevent additional batches from being released into that downstream processing, creating a huge backlog.
The above are handled by the slightly more complex option C. Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them. Thank you, Matt
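The "rate too short" failure mode in point 1 can be made concrete with a toy calculation. This is a back-of-the-envelope sketch, not NiFi code; the function name and parameters are invented for illustration. It models batches of `batch` FlowFiles released every `release_interval` seconds into a stage that needs `process_time` seconds per batch.

```python
# Toy model: backlog accumulated after `minutes` of running, when a fixed
# release rate feeds a downstream stage with a fixed processing time.
def backlog_after(minutes: int, release_interval: float, process_time: float, batch: int = 500) -> int:
    released = int(minutes * 60 / release_interval) * batch    # FlowFiles released by ControlRate
    processed = int(minutes * 60 / process_time) * batch       # FlowFiles the downstream flow finished
    return max(0, released - processed)

# Rate too short: a new batch every 30s, but 60s to process one -> backlog grows.
print(backlog_after(10, release_interval=30, process_time=60))   # 5000
# Rate matched to processing time -> no backlog accumulates.
print(backlog_after(10, release_interval=60, process_time=60))   # 0
```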
05-20-2025
10:16 AM
@BobKing ListFile only creates 0 byte (no content) FlowFiles that must be sent to a FetchFile processor to retrieve and add the content to the FlowFile. The List<abc> and Fetch<abc> type processors should be used instead of Get<abc> type processors when working in a multi-node NiFi cluster setup. These processors allow you to run List<abc> on the primary node only, load-balance the listed FlowFiles across all nodes in the NiFi cluster, and then Fetch<abc> the content. This spreads the workload across the cluster for ingest types that don't support cluster setups (ListFile, ListSFTP, ListFTP, etc.). TIP: Your attached image was so small I could not read the processor types. When you add an image to a new post, you can click on it and expand its size by dragging from the corner of the image before hitting "Reply". Thanks, Matt
05-20-2025
10:07 AM
@AndreyDE NiFi connection backpressure cannot trigger changes in the configuration of an upstream processor. Often the client library dictates what happens when the client is executed, and NiFi may have no control over the number of files returned. The NiFi scheduler responsible for giving a processor component a thread to execute looks at the downstream connection(s) coming off that processor, and if any are applying backpressure, it will not give the processor a thread. So backpressure thresholds are soft limits. When ListSFTP gets a thread, it executes the SFTP client, which returns all files matching the filtering configurations. Numbers 1 and 2 are not possible with ListSFTP due to the limitations I described above. For number 3 you have a couple of options:
A) You could place a ControlRate processor between ListSFTP and FetchSFTP to control the rate at which FlowFiles are moved between those processors. You can then throttle the rate at which FlowFiles are moved from the connection feeding from ListSFTP to your downstream dataflow, allowing time for those to process before the next batch is passed downstream.
B) Have your ListSFTP connect to a child process group with its "FlowFile Concurrency" setting set to "Single FlowFile Per Node". You can then place your downstream processing dataflow in this child process group, which will not allow another FlowFile to enter the process group until the FlowFile already in it is either auto-terminated or has exited the process group. The "Single FlowFile Per Node" concurrency setting means you would not have the ability to configure concurrent processing of multiple FlowFiles, which makes this a less desirable option.
C) A combination of both A and B. ListSFTP feeds a ControlRate processor (configured to allow 500 FlowFiles per minute) that lets batches of 500 FlowFiles move from the ListSFTP connection queue to the connection queue feeding a child process group. Configure the backpressure threshold on the connection feeding the child PG to 500 as well, so backpressure is applied as soon as ControlRate allows a batch through. This backpressure will prevent ControlRate from getting scheduled again until the queued batch is consumed by the child process group. On this child process group you also configure "FlowFile Concurrency", except with "Single Batch Per Node", which allows the process group to consume all FlowFiles from the inbound connection at once. It will then not consume from the inbound connection again until all FlowFiles have left the child process group. This design method controls the size of the batches being processed at one time in the child process group while still allowing concurrent execution of multiple FlowFiles by the processor components within it.
Option C's dataflow would look something like this (see attached screenshot): no more than 500 FlowFiles are being processed in the child process group at one time. Backpressure of 500 on the connection feeding the child process group prevents ControlRate from adding another 500 to that connection until it is emptied, which only happens when the FlowFile count inside the child process group hits 0 and it accepts the next batch. Inside the process group is where you build your dataflow to FetchSFTP and process the batches of FlowFiles however you need. Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them. Thank you, Matt
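The gating behavior of option C can be sketched as a toy simulation. This is a conceptual model, not NiFi code; the queues and function below are invented for illustration. It shows ControlRate releasing a batch of up to 500 only while the connection to the child process group is empty, and the "Single Batch Per Node" child PG draining the whole connection at once before the next release.

```python
from collections import deque

# Toy model of option C: ControlRate + backpressure of 500 + child process
# group configured with FlowFile Concurrency "Single Batch Per Node".
def run_option_c(total_files: int, batch_limit: int = 500):
    listed = deque(range(total_files))   # queue after ListSFTP
    connection = deque()                 # connection feeding the child PG (backpressure 500)
    in_child_pg = deque()                # FlowFiles inside the child PG
    batches = []
    while listed or connection or in_child_pg:
        # ControlRate is only scheduled when backpressure is clear.
        if not connection and not in_child_pg:
            for _ in range(min(batch_limit, len(listed))):
                connection.append(listed.popleft())
        # "Single Batch Per Node": child PG consumes the whole connection at once.
        if connection and not in_child_pg:
            while connection:
                in_child_pg.append(connection.popleft())
            batches.append(len(in_child_pg))
        in_child_pg.clear()              # batch fully processed, PG count hits 0
    return batches

print(run_option_c(1200))  # [500, 500, 200]
```

Each element of the result is one batch processed to completion before the next 500 were released, which is exactly the behavior the connection backpressure plus batch concurrency setting enforces.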
05-19-2025
11:33 AM
@blackboks Yes, that is correct, unless you can sync user identity to group identity associations via one of the available user-group-providers in NiFi/NiFi-Registry. See the NiFi System Administrator Guide: FileUserGroupProvider, LdapUserGroupProvider, AzureGraphUserGroupProvider. Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them. Thank you, Matt
05-19-2025
10:22 AM
@BobKing I have not been able to reproduce this. I downloaded WinZip, created a simple text file, and then zipped it. I then consumed that .zip file using CFM 2.1.7.1001 and was able to successfully use UnpackContent to unpack the text file. I recommend, if you have a support contract with Cloudera, that you create a support case where you can share more detail about the problematic zip files and an example file if possible. I suspect the issue is specific to the zip files you are working with. Note that UnpackContent does not support multi-part zip files (NIFI-10654). Thank you, Matt