Member since
07-30-2019
3472
Posts
1642
Kudos Received
1020
Solutions
09-22-2022
06:24 AM
@ImranAhmed
Can you share a screenshot of your dataflow and the configuration of your ReplaceText processor?

You mention XML. Is the source file you are trying to perform ReplaceText on an XML format file? If so, that may not be a plain text (ASCII) content file. In that case, your configured search value is probably not matching on anything, and thus nothing gets changed in the original binary content.

NiFi is a content-agnostic application. This means that NiFi can ingest any type of data. It does this by wrapping that content into what NiFi calls a FlowFile. A FlowFile consists of two parts:
1. FlowFile metadata/attributes - Stored in the NiFi flowfile_repository. It contains details about the content such as filename, size of content, location of stored content, and any other attributes added by NiFi components (processors, controller services, etc.) as the FlowFile traverses these components in your dataflow.
2. FlowFile content - Stored in claim files within the NiFi content_repository. NiFi simply writes the binary content to a claim and records the starting byte location and number of bytes of the content.

This way NiFi does not need to be able to read the content to move it through a dataflow. Knowing how to read the content of a FlowFile becomes the responsibility of each individual component's code, so NiFi includes processor components for many different data types.

As far as XML content files go, NiFi has limited native options (SplitXml, TransformXml, ValidateXml, XMLReader, and XMLRecordSetWriter). The latter two are controller services that could be used by processors like ConvertRecord.

There is also the possibility that one of NiFi's scripting processors could be used, where the user writes a script that can read and handle the specific content type.

There are also execute processors that can execute an external command, on the server where NiFi is running, against the content of a FlowFile. So if there is an external command or service that can take content as input and return modified content, one of those processors could be used.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
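The two-part FlowFile layout described in this reply can be sketched in a few lines of Python. This is only a toy model, not NiFi's implementation: the class names `ContentClaim` and `FlowFile`, the helper `read_content`, and the in-memory `claim_files` dictionary are all hypothetical and exist only to show how an offset and a length are enough to move bytes around without interpreting them.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ContentClaim:
    """Hypothetical pointer to a FlowFile's bytes inside a claim file."""
    claim_id: str   # which claim file holds the bytes
    offset: int     # starting byte location within the claim file
    length: int     # number of bytes belonging to this FlowFile


@dataclass
class FlowFile:
    """Hypothetical FlowFile: attributes plus a pointer to the content."""
    attributes: Dict[str, str]
    claim: ContentClaim


def read_content(claim_files: Dict[str, bytes], flowfile: FlowFile) -> bytes:
    """Return the raw bytes for a FlowFile without interpreting them."""
    data = claim_files[flowfile.claim.claim_id]
    start = flowfile.claim.offset
    return data[start:start + flowfile.claim.length]


# One claim file holding the content of two FlowFiles back to back.
claim_files = {"claim-1": b"<a>1</a>" + b"\x00\x01\x02"}
xml_ff = FlowFile({"filename": "a.xml"}, ContentClaim("claim-1", 0, 8))
bin_ff = FlowFile({"filename": "b.bin"}, ContentClaim("claim-1", 8, 3))

print(read_content(claim_files, xml_ff))
print(read_content(claim_files, bin_ff))
```

Nothing in `read_content` cares whether the bytes are XML or arbitrary binary, which is the sense in which the framework is content agnostic; only a component that wants to modify the content has to understand its format.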
09-22-2022
05:32 AM
@wasabipeas
What version of NiFi-Registry is being used as well?

In your NiFi UI, search for component UUID (a8db3982-1350-1b8b-ffff-fffff988699d).
- What kind/type of component is it?
- What is the current state of the component (enabled, disabled, running, stopped, enabling, disabling, starting, stopping)?
- Share a screenshot of its current configuration.

Thanks,
Matt
09-22-2022
05:22 AM
@myzard
Did your LDAP manager password contain any XML special characters?

Did you verify that ldapsearch worked from the same host where NiFi is installed, using that Manager DN and Manager password, to get a return for the user you are trying to log in with? What output did you get from ldapsearch?

For the ldap-provider, there are only two sets of usernames and passwords in use:
1. The Manager DN and Manager password configured in the ldap-provider
2. The username and password entered at the login interface

Other suggestions:
- Make sure there are no leading or trailing whitespaces on the username or password configured in the provider or entered at the login window.
- Make sure the nifi.properties file is properly configured for the ldap-provider and not a different login provider like the kerberos-provider.
- Share your login-identity-providers.xml file.

Thanks,
Matt
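As a small illustration of why XML special characters in the Manager password matter, the sketch below escapes a value before it would be pasted into an XML configuration file. It assumes the password sits inside an XML element, so the characters `&`, `<`, `>`, `"` and `'` have to be written as entities; the helper name `escape_for_xml` and the sample password are made up for this example.

```python
from xml.sax.saxutils import escape


def escape_for_xml(value: str) -> str:
    """Escape the five XML special characters in a config value."""
    # escape() handles &, < and > by default; quotes are added explicitly.
    return escape(value, {'"': "&quot;", "'": "&apos;"})


# Hypothetical password containing XML special characters.
print(escape_for_xml("p&ss<word>"))
```

Comparing the escaped form with what is actually stored in the provider configuration is a quick way to rule this cause in or out.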
09-22-2022
05:13 AM
@skoleg
Looks like you may have an issue with your self-signed node certificates. Can you share the output of your keystore and truststore from both nodes:

keytool -v -list -keystore <keystore filename>
keytool -v -list -keystore <truststore filename>

I wonder if perhaps you are missing the required clientAuth ExtendedKeyUsage (EKU).

Thanks,
Matt
09-20-2022
11:16 AM
@skoleg
Something is not configured the same if you are getting different behavior out of each node. Unfortunately, without seeing your configuration files (nifi.properties, login-identity-providers.xml, authorizers.xml, authorizations.xml, and users.xml) and app-logs/user-logs, it would be difficult to provide additional suggestions on your setup.

Make sure your NiFi nodes are authorized to proxy user requests, although I'd expect you to get an exception in the UI if they were not.

"Anonymous" happens when no client/user authentication was successful.

Thanks,
Matt
09-20-2022
11:08 AM
@AyanF
I would hope that if you are securing a processor such as HandleHttpRequest to control access to this endpoint, you have also secured your NiFi instance/cluster as well? If so, then somewhere along the line a keystore and truststore were generated to secure your NiFi.

SSL/TLS is a very deep subject which is a bit much to discuss end-to-end here, but I will give a 10,000 foot view to help get you started.

Keystore files (keystore or truststore) are not unique to NiFi or to any NiFi processor that uses them. Keystores and truststores are used to facilitate TLS handshakes and establish a secure, encrypted connection between a client and a server.

A keystore used by NiFi would contain a single "PrivateKeyEntry". The PrivateKeyEntry used by the HandleHttpRequest processor must have an Extended Key Usage (EKU) that supports at least serverAuth (for securing NiFi itself, it would need both the serverAuth and clientAuth EKUs). It must also contain a SubjectAlternativeName (SAN) that contains the hostname of the server on which your NiFi is running. Commonly, the same keystore used to secure your NiFi instance is also used in the SSLContextService to secure components used in your dataflows, but that is not a requirement.

A truststore contains one to many "TrustedCertEntry"s. The TrustedCertEntries are the public certs/keys for self-signed certificates and for intermediate and root Certificate Authorities (CAs).

TLS works the same when accessing a secured NiFi or a secured NiFi processor as it does when accessing any other https site. The server side of the connection decides if the TLS handshake should be 1-way or mutual. In a 1-way TLS connection, the secured server provides its certificate to the client. The client verifies that it trusts that server certificate. The server does not require the client to identify itself via a client certificate. This is how https://www.google.com works.

The server has the option to require (need) or request (want) a client certificate. The HandleHttpRequest processor, which would be the server side of the TLS connection, can be configured for one of those options. In a TLS client and server exchange, the certificate DN is always used to identify the server to the client and the client to the server. The client certificate presented must be trusted by the server side as well.

A truststore is included with every Java distribution. The file is named "cacerts" and it contains a bunch of TrustedCertEntries for well-known CAs. You can view the contents of a keystore like the cacerts file using Java's keytool command:

keytool -v -list -keystore <path to keystore file>/<keystore filename>

Naturally, if you create your own certificate it would be self-signed, and the cacerts file would need to have the public cert/key for your self-signed private cert/key added to it.

Here is an example process for creating a keystore:
https://docs.oracle.com/cd/E19509-01/820-3503/ggfen/index.html

There are also services (some free) you can use to generate a signed certificate. Those services will also provide you with the public certs for your truststore.
https://www.tinycert.org/

But simply having a keystore and truststore on the NiFi (server) side is not enough to establish a secured connection. The client that is connecting to your NiFi HandleHttpRequest endpoint must also have a truststore. That truststore must contain the public cert/key if using a self-signed server side cert, or the complete trustchain for the CAs that signed your server certificate.

What is a trustchain? When you get a server cert signed, it may be signed by an intermediate CA or a root CA. An intermediate CA is a CA that itself has been signed by yet another CA. A root CA is signed by itself (owner and issuer are the same DistinguishedName (DN)). The complete trustchain would involve having every public cert from the intermediate CA to the root CA.

Let's say your server cert is signed by intCA1 and intCA1 is signed by rootCA. RootCA is signed by rootCA. So the complete trustchain would require having the public cert/key for both intCA1 and rootCA in the truststore.

Now if you choose to configure your HandleHttpRequest processor to "need authentication", the client must have a certificate that the truststore configured in your SSLContextService is capable of trusting (meaning that truststore needs the complete trustchain for that client cert/key). Keep in mind that any client certificate that is trusted by that truststore will be able to interface with that endpoint.

I understand this is a lot, but it only scratches the surface. Hopefully it is enough to get you where you need to be to secure your NiFi and NiFi components.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
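The trustchain walk described in this reply (server cert signed by intCA1, intCA1 signed by rootCA, rootCA signed by itself) can be sketched as follows. This is a toy model that tracks only subject and issuer names; it does no cryptographic verification, and the names `ISSUER_OF`, `required_chain`, and `missing_from_truststore` as well as the DN strings are hypothetical.

```python
from typing import Dict, List, Set

# Toy model: certificate subject DN -> issuer DN (no signatures checked).
ISSUER_OF: Dict[str, str] = {
    "CN=nifi-server": "CN=intCA1",
    "CN=intCA1": "CN=rootCA",
    "CN=rootCA": "CN=rootCA",  # a root CA is signed by itself
}


def required_chain(subject: str, issuer_of: Dict[str, str]) -> List[str]:
    """List every cert a truststore needs in order to trust `subject`."""
    chain: List[str] = []
    current = issuer_of[subject]
    while True:
        chain.append(current)
        parent = issuer_of[current]
        if parent == current:  # owner and issuer are the same DN: root reached
            return chain
        current = parent


def missing_from_truststore(subject: str, issuer_of: Dict[str, str],
                            truststore: Set[str]) -> List[str]:
    """Return the chain entries that are not present in the truststore."""
    return [ca for ca in required_chain(subject, issuer_of)
            if ca not in truststore]


print(required_chain("CN=nifi-server", ISSUER_OF))
print(missing_from_truststore("CN=nifi-server", ISSUER_OF, {"CN=rootCA"}))
```

For a self-signed certificate the same walk returns just the certificate itself, which matches the point that the client truststore must then hold that self-signed public cert.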
09-20-2022
10:16 AM
1 Kudo
@wasabipeas
The revision is incremented any time a change occurs on a component, to make sure that all nodes are running the exact same dataflow. Revisions have nothing directly to do with version controlled dataflows. If you were to restart your entire cluster (not a rolling restart, but a shutdown of all nodes followed by a start of all nodes), component revisions will start over.

"for some reason the local flowfile does not reflect the versioned configuration"
Are you saying that if you access the NiFi UI from a different node in your 11 node cluster, this process group renders differently? Screenshots would be helpful in understanding your descriptions. Does the process group indicate it is under version control? Does it report "local changes"?

Revision issues can happen when a NiFi node is not running the same version as the other nodes in the cluster. Let's say some processor component you are using has a newer version on other nodes, and the newer version of the processor introduced a new property. So on some nodes the property exists and on others it does not.

I suggest verifying that all nodes in your cluster are running the same version of NiFi. Additionally, compare the contents of the NiFi lib directory(s) to make sure they are the same on all nodes. This includes any custom lib directories or anything you may have added to the extensions directory.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
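One way to do the lib directory comparison suggested here is to collect a file listing from each node and diff the sets. The sketch below assumes the listings have already been gathered (for example by listing the lib directory on each host); the function name `diff_listings`, the node names, and the file names are all hypothetical.

```python
from typing import Dict, Set


def diff_listings(listings: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each node, report files that exist on some node but not on it."""
    all_files: Set[str] = set().union(*listings.values())
    report: Dict[str, Set[str]] = {}
    for node, files in listings.items():
        missing = all_files - files
        if missing:
            report[node] = missing
    return report


# Hypothetical listings; node2 carries a different version of one nar file.
listings = {
    "node1": {"example-a.nar", "example-b-1.0.nar"},
    "node2": {"example-a.nar", "example-b-1.1.nar"},
    "node3": {"example-a.nar", "example-b-1.0.nar"},
}

for node, missing in sorted(diff_listings(listings).items()):
    print(node, "is missing", sorted(missing))
```

An empty report means every node has the same set of file names; it does not prove the file contents are identical, so comparing checksums would be a stricter follow-up.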
09-19-2022
06:04 AM
@skoleg
Do you have a network load balancer in front of your NiFi?

The user authentication token issued is only good for the NiFi host from which it was issued. So if node1 returns a client bearer token following successful authentication, and the load balancer then sends a subsequent request to node2, node2 will not be able to accept that bearer token and will return the user to the login page. When using an external load balancer, it is important to make sure sticky sessions are configured so that all redirects after login continue to get sent to the same NiFi node.

----------------

If a load balancer is not in play here, verify that the configuration is the same in both nodes' nifi.properties (except hostnames and keystore files), login-identity-providers.xml, and authorizers.xml files.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
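The load balancer point in the reply to @skoleg can be illustrated with a toy simulation. It is not NiFi code: the `Node` class, the token format, and the two routing functions are hypothetical, and the only thing modeled is that a token is accepted solely by the node that issued it.

```python
import itertools
from typing import Callable, Dict, List, Set


class Node:
    """Toy node that only accepts bearer tokens it issued itself."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.issued: Set[str] = set()

    def login(self, user: str) -> str:
        token = f"{user}@{self.name}"
        self.issued.add(token)
        return token

    def accepts(self, token: str) -> bool:
        return token in self.issued


def round_robin(nodes: List[Node]) -> Callable[[str], Node]:
    """Route every request to the next node, ignoring who the client is."""
    cycle = itertools.cycle(nodes)
    return lambda client: next(cycle)


def sticky(nodes: List[Node]) -> Callable[[str], Node]:
    """Route all requests from one client to the same node."""
    cycle = itertools.cycle(nodes)
    assigned: Dict[str, Node] = {}

    def route(client: str) -> Node:
        if client not in assigned:
            assigned[client] = next(cycle)
        return assigned[client]

    return route


def login_then_request(route: Callable[[str], Node], client: str) -> bool:
    """Log in through the balancer, then send one follow-up request."""
    token = route(client).login(client)
    return route(client).accepts(token)


print(login_then_request(round_robin([Node("node1"), Node("node2")]), "alice"))
print(login_then_request(sticky([Node("node1"), Node("node2")]), "alice"))
```

With round-robin routing the follow-up request lands on the other node and the token is rejected, which corresponds to being sent back to the login page; with sticky routing the same node sees both requests.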
09-19-2022
05:57 AM
@Sanchari
NiFi FlowFiles reside in connections between NiFi component processors. When a processor gets a thread to execute, it takes the highest priority FlowFile from an inbound connection queue and executes the processor code utilizing that FlowFile's metadata/attributes and content (if the processor needs content). The FlowFile is not transferred to a processor's outbound connection(s) until execution is complete.

When NiFi is shut down gracefully (meaning a user has initiated a shutdown), NiFi stops scheduling future component execution. NiFi then gives existing executing threads a grace period to complete their thread execution. At the end of that grace period, any still running threads are killed with the JVM. Since FlowFiles do not transfer to an outbound connection until code execution has completed, any FlowFile that was owned by a thread at the time the thread was killed still remains on the inbound connection. When NiFi is started again and the dataflows are started, the file processing will start over when the processor executes again against the highest priority FlowFile in the connection.

The above being said, NiFi will favor data duplication over data loss every time. It is possible, in a small window of time, that a processor executes and part of that execution is, let's say, to write a file to a remote server. For example, the transfer to the remote system may have completed, but the NiFi JVM was killed before NiFi received the acknowledgment back from the target server. So the FlowFile would end up being processed again, potentially resulting in data duplication on the target server. These are rare race conditions, but possible.

A restart is nothing more than a standard shutdown followed by a start. The same behavior exists in the shutdown process as described above when a restart is performed.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
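The "duplication over loss" behavior described in this reply can be sketched with a toy queue. This is not how NiFi is implemented; `run_once`, the `Killed` exception, and the list standing in for the remote server are hypothetical, and the sketch only shows that a FlowFile left on the inbound queue gets processed again after a kill.

```python
from collections import deque
from typing import Deque, List


class Killed(Exception):
    """Stands in for the JVM being killed in the middle of an execution."""


def run_once(inbound: Deque[str], outbound: List[str], remote: List[str],
             crash_before_commit: bool = False) -> None:
    """Process the first queued FlowFile; move it only after success."""
    flowfile = inbound[0]        # the FlowFile stays on the inbound queue
    remote.append(flowfile)      # side effect: write to the remote server
    if crash_before_commit:
        raise Killed()           # killed before the transfer was recorded
    outbound.append(inbound.popleft())  # transfer to the outbound connection


inbound: Deque[str] = deque(["ff-1"])
outbound: List[str] = []
remote: List[str] = []

try:
    run_once(inbound, outbound, remote, crash_before_commit=True)
except Killed:
    pass  # simulated shutdown; ff-1 is still queued on the inbound side

run_once(inbound, outbound, remote)  # after restart, ff-1 is processed again
print(remote)
print(outbound)
```

After the second run the remote list holds the same FlowFile twice while the outbound list holds it once, so nothing was lost but the target received a duplicate.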
09-16-2022
12:49 PM
@EuGras
You have a FlowFile queued somewhere within your dataflow with UUID=6ce9e262-b20b-4372-a3b9-43c2c00e8caa

The connection is trying to read the content for that FlowFile from a content claim found in the content_repository, in order to load balance data across nodes in the cluster, here:
id=1663256223724-231072817, container=default, section=49
<path to>/content_repository/49/1663256223724-231072817

The FlowFile metadata/attributes have recorded that this content should be 2203 bytes in length; however, this file is only 1130 bytes in size. So it appears that your disk issue resulted in data corruption.

You could use NiFi data provenance to locate this FlowFile by UUID or filename (04190e1f-fdca-4352-a796-6b6c9ce41baa) to determine which connection contains it. On that connection you could disable the load-balance connection configuration, then add a RouteOnAttribute processor to filter this one bad FlowFile out from the other FlowFiles that may be queued in that same connection, and auto-terminate it once it has been routed out. This is not to say that you do not have other corruption caused by your disk issues besides this one FlowFile.

If you do not care about the data on the nodes that had the disk issues, as another option, you could shut down that one node and purge the contents of the flowfile_repository and content_repository. This will effectively delete all FlowFiles queued in connections on that one node. Then restart the NiFi node. It will construct new content and FlowFile repositories on startup.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
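The size mismatch described in this reply (2203 bytes expected, 1130 bytes on disk) boils down to a simple check, sketched below. The function name `claim_is_truncated` is hypothetical, the starting offset of 0 is an assumption made for the example, and a temporary file stands in for the real claim file so the sketch runs anywhere.

```python
import os
import tempfile


def claim_is_truncated(path: str, offset: int, length: int) -> bool:
    """True if the file is too short to hold `length` bytes from `offset`."""
    return os.path.getsize(path) < offset + length


# Stand-in for a claim file: 1130 bytes on disk, as in the error above.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 1130)
    claim_path = tmp.name

try:
    # The FlowFile metadata says 2203 bytes; offset 0 is assumed here.
    print(claim_is_truncated(claim_path, offset=0, length=2203))
finally:
    os.remove(claim_path)
```

A check like this only detects claim files that are too short; it cannot tell whether bytes inside a correctly sized file were corrupted.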