Member since: 07-30-2019
Posts: 3410
Kudos Received: 1623
Solutions: 1008

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 373 | 12-17-2025 05:55 AM |
| | 434 | 12-15-2025 01:29 PM |
| | 450 | 12-15-2025 06:50 AM |
| | 379 | 12-05-2025 08:25 AM |
| | 623 | 12-03-2025 10:21 AM |
09-14-2023
11:24 AM
@davehkd If your NiFi cluster is secured, you'll need to make sure that the load balancer is configured with sticky sessions (also known as session persistence). This is needed because NiFi authentication (except certificate based mutual TLS authentication) issues a client and server side token. The issued client token gets passed by the client (browser) with every subsequent request made to NiFi. The corresponding server side token only exists on the specific NiFi node that handled the authentication. So if your LB routes subsequent requests to a different node, authentication will fail for that request.

Many users set up LBs in front of NiFi so there is one URL that can direct to any number of nodes in the NiFi cluster that are all capable of handling authentication and authorization. This ensures ease of access, for example when a node in the cluster is down.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
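As a rough illustration, here is a minimal HAProxy sketch in TCP passthrough mode with source-IP stickiness, so each client keeps hitting the node that authenticated it. The choice of HAProxy, the hostnames, and the port are placeholders for illustration, not taken from this thread:

frontend nifi_front
    bind *:8443
    mode tcp
    default_backend nifi_nodes

backend nifi_nodes
    mode tcp
    balance source    # source-IP stickiness: one client always lands on one NiFi node
    server node1 nifi-node1.example.com:8443 check
    server node2 nifi-node2.example.com:8443 check
    server node3 nifi-node3.example.com:8443 check

Cookie-based persistence is also common, but it requires terminating and re-encrypting TLS at the LB; TCP passthrough keeps NiFi's own TLS exchange intact.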
09-14-2023
11:10 AM
1 Kudo
@LKB You would get better traction/feedback if you start your own community question; your query is not closely related to the issue in this post. As far as the one question related to this post about the encrypted manager password, @mks27 simply masked it by using "***" in his post. NiFi does not replace the actual password with "*" characters when encrypting sensitive passwords. The NiFi Encrypt-Config Toolkit can be used to encrypt passwords used in various NiFi configuration files: https://nifi.apache.org/docs/nifi-docs/html/toolkit-guide.html#encrypt_config_tool

Thank you,
Matt
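For reference, a minimal sketch of an Encrypt-Config invocation; the paths and password below are placeholders, and the exact flags for your toolkit version should be verified against the toolkit guide linked above:

# -n: nifi.properties to protect, -l: login-identity-providers.xml,
# -b: bootstrap.conf where the derived root key is written, -p: protection password
./encrypt-config.sh -n /path/to/nifi.properties \
    -l /path/to/login-identity-providers.xml \
    -b /path/to/bootstrap.conf \
    -p 'myProtectionPassword'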
09-14-2023
11:04 AM
@viskumar A Cloudera license is required to access Cloudera distributions located on archive.cloudera.com. You can download open source versions of Mahout from Apache: https://mahout.apache.org/

Hope this helps you,
Matt
09-14-2023
10:55 AM
@manishg I am not clear on the issue; can you share the processor type where the expected data is not present? The "Records Processed" graph in View Status History will only appear on "record" type processors (example: the PartitionRecord processor). It would not exist on processors that don't process "records".

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
09-14-2023
10:10 AM
2 Kudos
@Soli There is a likelihood that once you upgrade you will have ghosted processors (happens when you are using a processor that no longer exists in the newer release) or invalid components (happens when a new version of the same component class introduces new required properties that need to be configured).

Something you may want to try is standing up a different NiFi on 1.23 with auto-resume disabled in nifi.properties, as shown below. Drop your flow.xml.gz into that NiFi and start it. Everything will load up in a stopped state. At least then you can validate that all your processors are valid and none are ghosted. If this standalone can reach your Redis, make sure you don't start processors that use it, as that can mess up state recorded in your prod NiFi.

You also did not mention if you have a standalone NiFi or a NiFi cluster setup. In a NiFi cluster, Zookeeper is also used by some components to store cluster wide state, so you would not utilize the same ZK in the test setup used to check for component issues.

At least you will be able to rule out flow design issues and be able to make note of what needs to be fixed in your production upgrade prior to doing the upgrade.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
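The relevant line in the test instance's nifi.properties would look like this (the property name is the standard one from the NiFi Administration Guide):

# Do not automatically resume component state when the flow loads;
# all components load in a stopped state instead.
nifi.flowcontroller.autoResumeState=false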
09-14-2023
09:55 AM
@manishg Copying a file should have resulted in a new timestamp, making it newer than the last file you previously listed. However, moving a file typically does not change the timestamps on the file. My guess here is you moved a file into the directory instead of copying it?

Matt
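A quick way to see the difference on a Linux shell (the paths here are just examples):

# copy: the destination file gets a fresh modification time
cp /data/in/file.txt /data/watched/file.txt
stat -c '%y' /data/watched/file.txt

# move: the original modification time is preserved
mv /data/in/file2.txt /data/watched/file2.txt
stat -c '%y' /data/watched/file2.txt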
09-14-2023
09:51 AM
@mks27 I am reading through this post and see multiple conflicting outputs shared by you, which imply config changes were applied between updates added to this post.

First you need to understand that NiFi authentication and NiFi authorization are two totally separate processes. After authentication is successful, the user identity string is evaluated against any identity mapping patterns configured in the nifi.properties file. If a Java regex mapping pattern matches against the user identity string returned during authentication, the configured associated identity mapping value is applied. At this point the user identity string is passed off to the authorizer configured in NiFi to verify that the user is authorized for the requested endpoint being accessed. The authorizer must be aware of all user identity strings, and those users must be authorized to the resource before a user will be authorized. It is IMPORTANT to understand that NiFi is case sensitive (identities "bob" and "BOB" would be treated as two different users).

In your initial query you stated that the NiFi UI shows successful authentication but indicates that authorization was then not successful. We know this because it returned a user identity (determined during authentication) and then reported that user was not known to your NiFi during authorization verification:

Unknown user with identity 'cn=Mohit Kumar,ou=FM-Users,ou=Managed services,dc=CORP,dc=SA,dc=ZAIN,dc=COM'. Contact the system administrator.

In your same post you shared the DN from your ldapsearch response as:

CN=Mohit Kumar,OU=FM-Users,OU=Managed services,DC=CORP,DC=SA,DC=ZAIN,DC=COM

As we can see, these do not match. Regardless of the above, what NiFi received in response to your authentication request from your LDAP is what is displayed in the NiFi UI.

Now, in a later post you shared the nifi-user.log output below:

2023-05-23 02:53:37,220 INFO [NiFi Web Server-21] o.a.n.w.a.c.AccessDeniedExceptionMapper identity[mohit.kumar], groups[] does not have permission to access the requested resource. Unknown user with identity 'mohit.kumar'. Returning Forbidden response.

This log line implies that a user was successfully authenticated with a user identity of "mohit.kumar". This is not the same user as shared in the initial post. My guess here is that you changed your ldap-provider from using:

<property name="Identity Strategy">USE_DN</property>

to:

<property name="Identity Strategy">USE_USERNAME</property>

The "USE_USERNAME" strategy is more commonly used. Upon successful authentication, the username entered at the NiFi login prompt is used as the user identity rather than the DN returned by LDAP. Or did you set up some identity mapping pattern that matched your full DN, extracted just the CN, and set it to all lowercase?

NiFi authorization is handled by the authorizers.xml NiFi configuration file. In your authorizers.xml you have the "Managed authorizer", which has a configured dependency on the "File-Access-Policy-Provider", which itself has a configured dependency on the "File-User-Group-Provider". The File-User-Group-Provider is responsible for building the users.xml file and populating it with a few initial entries. This provider will ONLY generate a users.xml file if it does NOT already exist. So any edits to this configuration after the users.xml file already exists will not be reflected in that file.
I see you have configured this provider to create the following user identity:

<property name="Initial User Identity 1">CN=Mohit Kumar,OU=FM-Users,OU=Managed services,DC=CORP,DC=SA,DC=ZAIN,DC=COM</property>

This identity matches neither identity mentioned earlier that resulted from successful authentication (remember that NiFi is case sensitive). I would recommend changing this to the following and deleting the users.xml so it gets recreated:

<property name="Initial User Identity 1">mohit.kumar</property>

Make sure you also use "USE_USERNAME" in your ldap-provider.

The File-Access-Policy-Provider is responsible for building the authorizations.xml file, again only if it does not already exist. Within this provider you defined who your initial admin user identity should be. When building the authorizations.xml file for the first time, this initial admin user identity will be granted the authorization needed to act as an administrator.

<property name="Initial Admin Identity">CN=Mohit Kumar,OU=FM-Users,OU=Managed services,DC=CORP,DC=SA,DC=ZAIN,DC=COM</property>

This should be changed to the below, and the current authorizations.xml (not authorizers.xml) must be deleted so it can be rebuilt based on the new initial admin:

<property name="Initial Admin Identity">mohit.kumar</property>

Now restart your NiFi and login using "mohit.kumar" in the NiFi login window. I should note that I am assuming here that "mohit.kumar" is your user's sAMAccountName value in your LDAP entry.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
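For completeness, if a lowercase identity had instead come from an identity mapping, the mechanism would look something like this in nifi.properties. This is a hypothetical sketch (the pattern and value are illustrative, not taken from your configuration), using the standard mapping/transform properties from the NiFi Administration Guide; the transform accepts NONE, LOWER, or UPPER:

# Extract the CN from a matching DN and lowercase the result
nifi.security.identity.mapping.pattern.dn=^CN=(.*?),OU=.*$
nifi.security.identity.mapping.value.dn=$1
nifi.security.identity.mapping.transform.dn=LOWER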
09-12-2023
01:30 PM
1 Kudo
@MmSs NiFi is data agnostic. To NiFi, the content of a FlowFile is just bits. To remain data agnostic, NiFi uses what it calls a "FlowFile". A FlowFile consists of two parts: FlowFile attributes/metadata (persisted in the FlowFile repository and held in JVM heap memory) and FlowFile content (stored in content claims within the content repository). This way the NiFi core does not need to care or know anything about the format of the data/content. It becomes the responsibility of an individual processor component that needs to read or manipulate the content to understand the bits of content. The NiFi FlowFile metadata simply records in which content claim the bits exist, at what offset within the claim the content starts, and the number of bits that follow. As far as directory paths go, these become just additional attributes on a FlowFile and have no bearing on NiFi's persistent storage of the FlowFile's content in the content repository.

As far as UnpackContent goes, the processor will process both zip1 and zip2 separately. Unpacked content from zip1 is written to a new FlowFile, and the same holds true for zip2. So if you stop the processor immediately after your UnpackContent processor and send your zip1 and zip2 FlowFiles through, you can list the content on the outbound relationship to inspect them before further processing. You'll be able to view the content and the metadata for each output FlowFile. NiFi does not care if there are multiple FlowFiles with the same filename, as NiFi tracks them with a unique UUID within NiFi.

What you describe as zip1 content (already queued in the inbound connection to PutS3Object) being corrupted when zip2 is then extracted is not possible. Run both zip1 and zip2 through your dataflow with PutS3Object stopped and inspect the queued FlowFiles as they exist before PutS3Object is started. Are the queued FlowFiles on the same node in your NiFi cluster? Is your PutS3Object using "${filename}" as the object key? What happens if you use "${filename}-${uuid}" instead? My guess is the issue is in your PutS3Object configuration, leading to corruption on write to S3. So your issue seems more likely to be a flow design issue than a processor or NiFi FlowFile handling issue. Sharing all the processors you are using in your dataflow and their configuration may help in pinpointing your design issue.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
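To illustrate the suggestion above: the PutS3Object "Object Key" property supports NiFi Expression Language, so a key pattern like the following keeps two FlowFiles that happen to share a filename from overwriting each other in S3 (the exact pattern is just an example):

Object Key: ${filename}-${uuid}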
09-12-2023
12:58 PM
@manishg NiFi has High Availability at the control plane and not at the data level. For HA, NiFi utilizes a zero-master structure at the NiFi controller level through the use of Zookeeper (ZK). An external ZK cluster is used for electing a NiFi node as the cluster coordinator. If the currently elected cluster coordinator goes down, ZK will elect a new cluster coordinator from the remaining available nodes still communicating with ZK. This eliminates a single point of failure with accessing your NiFi cluster; any node in the cluster can be used for access.

The individual nodes in a NiFi cluster load and execute their own local copy of the flow (flow.xml.gz in older versions, flow.json.gz in newer releases). Each NiFi node also maintains its own set of repositories (database, flowfile, content, and provenance). The flowfile and content repositories only contain metadata/attributes and content for FlowFiles that traverse the dataflows on that specific node, so node 1 has no information about the data being processed on nodes 2, 3, etc. When a node is down, the FlowFiles currently queued on that node remain in its content and flowfile repositories until that node is brought back online or a new node is built where these repositories can be moved (you cannot merge existing repositories from different nodes). So it is always best to protect the data stored in these repositories (especially content and flowfile) via RAID to prevent data loss.

As far as your last question about aggregation of processing on different nodes, your question is not clear to me. Each node operates independently, with the exception of perhaps some cluster wide state which may be stored in ZK. Cluster wide state is primarily used by processors to prevent consumption of the same data by different nodes (example: a ListSFTP processor running on the primary node only; if a change in election results in a different node being elected primary, the new primary node starts its primary node only processors, which retrieve the last recorded cluster state and pick up where the old primary node's processors left off). It is the responsibility of the dataflow design engineer to construct dataflow(s) on the NiFi canvas that distribute data across the NiFi cluster for proper processing.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
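For context, cluster wide state is configured through the cluster-provider entry in state-management.xml. A minimal sketch of the stock ZooKeeper provider follows; the connect string hosts are placeholders, while the provider class and property names match the defaults shipped with NiFi:

<cluster-provider>
    <id>zk-provider</id>
    <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
    <property name="Connect String">zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</property>
    <property name="Root Node">/nifi</property>
    <property name="Session Timeout">10 seconds</property>
    <property name="Access Control">Open</property>
</cluster-provider>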
09-08-2023
12:23 PM
@OpenText-Orion SubjectAlternativeNames (SANs) would not be full distinguished names (DNs). SANs are used within the TLS exchange to verify that the client is connected to the correct intended target server; this avoids things like man-in-the-middle attacks. So essentially what you have are certificates you created for your 3 NiFi nodes:

node1.server.name
node2.server.name
node3.server.name

However, when you try to connect to a NiFi node, you are entering https://my.elb.name:<port>/nifi in your browser, which is directed to a NiFi node. Without a SAN entry present that matches the target hostname, the TLS exchange assumes the request was not intended for this target host, resulting in the exception you see. Recreate your node certificates using only hostnames as the SAN entries instead of a full DN. I executed the tls-toolkit.sh command you shared and it produced the correct expected output:

./tls-toolkit.sh standalone -n 'node1.server.name,node2.server.name,node3.server.name' --subjectAlternativeNames 'my.elb.name'

Alias name: nifi-key
Creation date: Sep 8, 2023
Entry type: PrivateKeyEntry
Certificate chain length: 2
Certificate[1]:
Owner: CN=node1.server.name, OU=NIFI
Issuer: CN=localhost, OU=NIFI
Serial number: 18a76360ce500000000
Valid from: Fri Sep 08 19:12:48 UTC 2023 until: Thu Dec 11 19:12:48 UTC 2025
Certificate fingerprints:
MD5: 75:70:0C:4F:41:D8:EA:9D:35:46:9E:C1:3B:9C:B0:E9
SHA1: 5C:0C:CC:B3:C8:29:62:05:5D:5B:C5:BB:71:39:20:40:48:CE:38:A5
SHA256: 17:79:FF:87:31:07:CB:9A:01:A5:82:03:A4:1B:3F:3D:F0:C3:79:21:C6:90:06:82:3D:FC:A1:0A:5F:64:DB:DE
Signature algorithm name: SHA256withRSA
Subject Public Key Algorithm: 2048-bit RSA key
Version: 3
Extensions:
#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: DA A8 38 36 C2 61 E3 CB DF 66 72 B5 FF D6 B7 F8 ..86.a...fr.....
0010: 92 2B 50 81 .+P.
]
]
#2: ObjectId: 2.5.29.19 Criticality=false
BasicConstraints:[
CA:false
PathLen: undefined
]
#3: ObjectId: 2.5.29.37 Criticality=false
ExtendedKeyUsages [
clientAuth
serverAuth
]
#4: ObjectId: 2.5.29.15 Criticality=true
KeyUsage [
DigitalSignature
Non_repudiation
Key_Encipherment
Data_Encipherment
Key_Agreement
]
#5: ObjectId: 2.5.29.17 Criticality=false
SubjectAlternativeName [
DNSName: node1.server.name
DNSName: my.elb.name
]
#6: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 05 52 D3 51 9B 56 27 EB D2 C1 62 42 A9 43 39 EF .R.Q.V'...bB.C9.
0010: 3A 8E 0D 42 :..B
]
]

Make sure you are looking at Certificate[1] of the PrivateKeyEntry; Certificate[2] in the PrivateKeyEntry chain is the signing certificate.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
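To repeat this inspection against your own keystore, an invocation like the following should work (the keystore filename is a placeholder; keytool prompts for the keystore password):

keytool -v -list -keystore keystore.jks

# Look for the SubjectAlternativeName extension under Certificate[1]; it should list
# the node hostname plus the LB hostname as DNSName entries.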