Member since 07-30-2019 | 3470 Posts | 1641 Kudos Received | 1018 Solutions
03-11-2024
08:39 AM
@Ghilani
1. Are you getting the exact same Invalid SNI exception?
2. Are you using the keystore and truststore built by Apache NiFi out-of-the-box?
3. Did you try using "localhost" if NiFi is on the same host as the browser being used to access it?
4. If the browser is on a different host than NiFi, did you use the hostname instead of the IP address for the target host where NiFi is running?
5. Did you list the keystore used by your running NiFi to inspect the SAN entries it has set up?

Thanks,
Matt
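Questions 3 to 5 all come down to whether the name typed into the browser appears among the certificate's SAN entries. The comparison can be sketched in Python as follows. This is a simplified illustration, not the actual check the server performs; the function name and the sample hostnames are hypothetical, and the SAN list is assumed to have been copied from your own keystore listing.

```python
def host_matches_san(host: str, san_entries: list[str]) -> bool:
    """Return True if the requested host matches any SAN entry.

    Handles exact matches (DNS names or IP strings) and single-label
    wildcards such as *.example.com. Simplified for illustration.
    """
    host = host.lower().rstrip(".")
    for entry in san_entries:
        entry = entry.lower().rstrip(".")
        if entry == host:
            return True
        # A wildcard covers exactly one leftmost label.
        if entry.startswith("*.") and "." in host:
            if host.split(".", 1)[1] == entry[2:]:
                return True
    return False


# Hypothetical SAN entries, as they might appear in a keystore listing.
sample_san = ["nifi01.example.com", "localhost"]
print(host_matches_san("localhost", sample_san))   # name is in the SAN list
print(host_matches_san("10.0.0.5", sample_san))    # IP is not in the SAN list
```

If the second call describes your situation (accessing NiFi by an IP address that is not listed as a SAN), switching to a hostname that is listed is the first thing to try.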
03-11-2024
07:41 AM
1 Kudo
@TreantProtector There is a lot being asked in this one post.

1. NiFi Registry stores version-controlled NiFi process groups. A user must manually start version control on a process group and manually commit each new version to NiFi-Registry. It does not store the flow.xml.gz or flow.json.gz files that contain all the flow information NiFi loads on startup, so it is not a substitute for protecting those files on NiFi. All nodes in a NiFi cluster use the same flow.xml.gz/flow.json.gz, so it is not necessary to preserve the files from every node for recovery.

2a. NiFi
- flow.xml.gz (legacy format) and flow.json.gz (current format): Apache NiFi stores the complete dataflow(s) on your canvas in these files. Preserving them preserves all your dataflows on the canvas. NOTE: all sensitive properties like passwords are encrypted in these files using the sensitive.props.key configured in NiFi, so make sure you save that password, or you will need to scrub these files of all enc{...} values before NiFi can load them. Removing those values would require you to re-enter all encrypted values in the NiFi components.
- Local state directory: This is unique to each node and stores state information for processors that store local state. It should be preserved to avoid data duplication.
- content_repository(s): Holds active content claims (still used by actively queued FlowFiles within your dataflows) and archived content claims (archive subdirectories holding claims that are no longer referenced by any active FlowFiles in the UI). This repository is tightly coupled to the flowfile_repository. Content repositories hold claims that are unique per node and need to be protected on all nodes to avoid data loss.
- flowfile_repository: Contains FlowFile metadata/attributes, including the reference to the content claim in the content_repository(s) along with byte offset and length. It is tightly coupled to the content_repository(s) on the same node, so make sure the same flowfile_repository is loaded with the corresponding content_repository(s) from the same node. This must be protected to avoid data loss.
- provenance_repository: Holds event data about FlowFile transactions and is unique per node. Losing it means losing provenance history, but would not cause loss of any queued FlowFiles. It is typically also placed on protected storage.
- database_repository (metadata): Metadata about users who authenticated to NiFi, and flow configuration history when using the embedded H2 DB. Not necessary to retain unless you want to preserve that historical information.
- Extension directory: Contains any custom NiFi nars you have added to your NiFi. Copies of your custom nars should be preserved somewhere so they can be restored easily should it be needed.
- Local authorization files: Files like users.xml and authorizations.xml, which contain the users and their associated authorizations granted over time through the NiFi UI, should be preserved or you'll need to set those back up again in recovery (same on all nodes).
- Node-specific local directories used in your dataflows (dataflows built on the canvas): Some components allow you to configure local directories for persistent storage. If you are using these, they should be persisted. Example: DistributedMapCacheServer 1.25.0

2b. NiFi-Registry
- Database: Contains all information about version-controlled flows and buckets. It should be protected unless you are using an external DB that you are protecting by other means. The default is an embedded H2 DB.
- Extensions directory: Protect this if it is being used to store version-controlled extensions (jars).
- Persistence provider: Stores the actual version-controlled NiFi process groups and is tightly coupled to the NiFi-Registry database. If using the external GitFlowPersistenceProvider, refer to Git for persistence requirements.
- Bundle persistence: Has local and S3 options. Protected storage should be used if using local.
- Local authorization files: Files like users.xml and authorizations.xml, which contain the users and their associated authorizations granted over time through the NiFi-Registry UI, should be preserved or you'll need to set those back up again in recovery.

Reference material: https://nifi.apache.org/docs/nifi-registry-docs/html/administration-guide.html#backup-recovery

3. Covered above. Refer to the Apache NiFi nifi.properties file for your configured local storage paths.

4. Yes, covered above.

5a. Not sure I follow the question. On restoration, NiFi-Registry will read from its persistence provider (whether local, Git, or S3). Preserving the NiFi and NiFi-Registry conf directory configuration files would make restoration easier. The NiFi content_repository(s) and flowfile_repository are tightly coupled to one another on the same node and tie back to the flow.xml.gz/flow.json.gz content (same on all nodes), but which node they get restored to does not matter (specific node information is not present in any of those).

NOTE: content repositories are directly correlated to the content repository property names in the nifi.properties file:

nifi.content.repository.directory.default=/dir1/node1
nifi.content.repository.directory.repo2=/dir2/node1

Upon restoration, the content_repository contents persisted for /dir1/node1 must still be placed under "default" and not under a different property name. This is because the FlowFile metadata in the corresponding flowfile_repository does not contain directory details. It simply says you can find the content for FlowFile xyz in nifi.content.repository.directory.default at sub-directory (num), content claim, byte offset, and number of bytes. So if you put dir2 in the default content_repository, NiFi will not be able to find your content.

6. ZooKeeper is used to store the cluster state used by a good number of NiFi processors. Every processor's documentation has a "State Management" section that tells you whether the component stores state and whether that state is local or cluster. State is stored per specific component. Cluster state stored in ZooKeeper is not node-specific, since all nodes running a component that uses cluster state share the same state information. Failing to protect against loss of state typically leads to data duplication, but it all depends on how a given processor uses that state information. Example: ListSFTP 1.25.0.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
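The content repository note in 5a can be illustrated with a short Python sketch that reads the repository names from a nifi.properties file and pairs each backed-up directory with the target directory configured under the same name. The function names are hypothetical and the parsing is deliberately simplified (no line continuations or escapes); treat it as a planning aid under those assumptions, not a restore tool.

```python
PREFIX = "nifi.content.repository.directory."


def content_repositories(properties_text: str) -> dict[str, str]:
    """Map content repository name (e.g. 'default') to its directory."""
    repos = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key.startswith(PREFIX):
            repos[key[len(PREFIX):]] = value.strip()
    return repos


def restore_plan(backed_up: dict[str, str], target: dict[str, str]) -> list[tuple[str, str]]:
    """Pair each backed-up directory with the target directory of the SAME name.

    Raises KeyError if the target configuration lacks a repository name that
    existed at backup time, since FlowFile metadata refers to repositories
    by name rather than by directory path.
    """
    missing = sorted(set(backed_up) - set(target))
    if missing:
        raise KeyError("no target repository configured for: " + ", ".join(missing))
    return [(backed_up[name], target[name]) for name in sorted(backed_up)]


original = content_repositories(
    "nifi.content.repository.directory.default=/dir1/node1\n"
    "nifi.content.repository.directory.repo2=/dir2/node1\n"
)
# Hypothetical directories on the replacement node.
replacement = {"default": "/new/dir1", "repo2": "/new/dir2"}
for source_dir, target_dir in restore_plan(original, replacement):
    print(f"restore {source_dir} -> {target_dir}")
```

The directory paths may change between backup and restore; what has to stay the same is which repository name each set of contents is loaded under.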
03-07-2024
08:25 AM
@sukanta This depends on which version of Apache NiFi is being used. Apache NiFi 1.12 and newer have the following property in the nifi.properties file for excluding the server version from HTTP responses:

nifi.web.should.send.server.version=<true or false>

The default is true when not configured. This capability was added as part of https://issues.apache.org/jira/browse/NIFI-7321

It is best to ask unrelated questions in separate community posts. Asking multiple questions in one post makes it hard for others in the community to understand which question was addressed by the "accepted" solution. Keep in mind that Apache NiFi is an open source product, so anyone can look at the source code to see what Jetty version(s) are being used.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
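To check what a given nifi.properties file currently resolves to, a small Python sketch like the one below can help. The function name is hypothetical, the parsing is simplified, and treating a blank value the same as an unset property is an assumption made here for illustration.

```python
KEY = "nifi.web.should.send.server.version"


def sends_server_version(properties_text: str) -> bool:
    """Return the effective setting, falling back to the documented default of true."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Assumption: a blank value is treated like an unset property.
        if name.strip() == KEY and value.strip():
            return value.strip().lower() == "true"
    return True  # default when not configured


print(sends_server_version("nifi.web.should.send.server.version=false"))
```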
03-06-2024
09:40 AM
1 Kudo
@hegdemahendra The Service Unavailable response to a request received by the HandleHttpRequest processor is most commonly the result of the FlowFile produced by HandleHttpRequest not being processed by a downstream HandleHttpResponse processor before the Request Expiration configured in the StandardHttpContextMap controller service. This aligns with the exception you shared from the NiFi logs. If you are seeing this exception before the 3 minute expiration you set, it is possible your client is closing the connection due to a client-side timeout. You would need to look at the client sending the requests to get details and options.

You mentioned you have 4200+ processors that are scheduled based on their individual configurations. When a processor is scheduled, it requests a thread from the configured Maximum Timer Driven Thread Count pool. So not all processors can execute concurrently, which is expected. You also have only 8 cores, so assuming hyper-threading you can actually service only 16 threads concurrently. What you have happening is time slicing, where your up to 200 concurrently scheduled threads each get bits of time on the CPU cores.

Good to see you looked at your core load average, which is very important as it helps you determine a workable size for your thread pool. If you have a lot of CPU-intensive processors executing often, your CPU load average is going to be high. For you I see well-managed CPU usage with some occasional spikes. I brought up the above because it directly relates to your processor scheduling.

The HandleHttpRequest processor creates a web server that accepts inbound requests. These requests stack up within that web server while the processor's executing threads read them and produce a FlowFile for each request. How fast this happens depends on available threads and the concurrent tasks configured on the HandleHttpRequest processor's Scheduling tab. By default an added processor has only 1 concurrent task configured. If you set this to, say, 5, the processor could potentially get allocated up to 5 threads to process the requests it receives. Another possibility is that you are seeing Service Unavailable because the container queue is filling faster than the processor is producing FlowFiles.

Hope this information helps you in your investigation and in finding a solution for your issue.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
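The two effects described above (thread oversubscription and a container queue that fills faster than it drains) can be put into rough numbers with a back-of-the-envelope Python sketch. The 200 threads and 8 cores come from this thread; the arrival rate, per-task throughput, and queue size are made-up sample values, and the model ignores bursts and scheduling jitter.

```python
from typing import Optional


def oversubscription(pool_size: int, cores: int, threads_per_core: int = 2) -> float:
    """Scheduled threads competing for each hardware thread."""
    return pool_size / (cores * threads_per_core)


def seconds_until_queue_full(
    arrivals_per_sec: float,
    requests_per_task_per_sec: float,
    concurrent_tasks: int,
    queue_size: int,
) -> Optional[float]:
    """Time until the container queue overflows, or None if the processor keeps up."""
    drain_rate = requests_per_task_per_sec * concurrent_tasks
    growth = arrivals_per_sec - drain_rate
    if growth <= 0:
        return None
    return queue_size / growth


print(oversubscription(200, 8))                  # threads per hardware thread
print(seconds_until_queue_full(60, 10, 1, 50))   # 1 concurrent task
print(seconds_until_queue_full(60, 10, 6, 50))   # 6 concurrent tasks
```

With these sample numbers a single concurrent task falls behind almost immediately, while six tasks keep up. Real throughput per task depends on how much CPU time each thread actually gets, which is where the oversubscription figure comes in.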
03-04-2024
08:34 AM
1 Kudo
@Chetan_mn NiFi's DistributedMapCacheServer controller service has existed in Apache NiFi since the 0.x releases, when Apache NiFi offered no HA at all. In the Apache NiFi 0.x releases, NiFi had a dedicated NiFi Cluster Manager (NCM) and NiFi cluster nodes that all reported to the NCM. The only way to access the NiFi UI was via the NCM server. The DistributedMapCacheServer controller service at that time ran only on the NCM and not on any of the cluster nodes. The DistributedMapCacheClient controller service ran on all the nodes so that all cluster nodes could read and write to the same cache server.

Fast forward to the Apache NiFi 1.x+ releases, where Apache NiFi eliminated the NCM with a new zero-master clustering ability. This provided HA at the NiFi control layer so that users could access the NiFi cluster via ANY connected node's URL. Since there was no dedicated NCM anymore, controller services like DistributedMapCacheServer, when added, now start a DistributedMapCache server on each node independently of one another. The multiple cache servers do NOT communicate or share cache entries with one another. So you effectively have a single point of failure with this cache provider. If the node your DistributedMapCacheClient is configured for goes down, you have an outage until it is recovered.

Apache NiFi now offers better options that provide a true distributed map cache capability, like HBase, Redis, Hazelcast, etc. These utilize an externally installed map cache service that offers better fault tolerance through HA, but they add an additional service dependency.

If you still choose to use the DistributedMapCacheServer, keep in mind that all cached entries are held in NiFi's heap memory. So the larger each cache entry is and the larger the number of cache entries held, the more NiFi heap will be consumed. The DistributedMapCacheServer has an optional configuration for "Persistence Directory". When configured, the cache entries are persisted to disk in the configured location. The amount of space required again depends on cache entry size and the number of entries to retain. Keep in mind that configuring this persistence directory does NOT remove cache entries from NiFi's heap memory. It simply persists a copy of the cache entries to disk so that, should NiFi be restarted, NiFi can reload the cache from disk into heap memory. If no persistence directory is configured, NiFi going down results in the loss of all cache entries.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
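As a rough sizing aid for the heap (and persistence directory) discussion above, the arithmetic can be sketched in Python. The function name is hypothetical, and the per-entry overhead is a placeholder guess rather than a measured NiFi figure; measure your own entries before relying on any number from this.

```python
def estimated_cache_bytes(
    entries: int,
    avg_key_bytes: int,
    avg_value_bytes: int,
    per_entry_overhead_bytes: int = 64,  # placeholder assumption, not a measured value
) -> int:
    """Approximate memory needed to hold all cache entries at once."""
    return entries * (avg_key_bytes + avg_value_bytes + per_entry_overhead_bytes)


# Sample values: 10,000 entries with 36-byte keys and 1 KiB values.
size = estimated_cache_bytes(10_000, 36, 1024)
print(f"{size / (1024 * 1024):.1f} MiB")
```

The same estimate gives a lower bound for the disk space the persistence directory would need, since it holds a copy of what is in heap.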
03-04-2024
07:30 AM
@MvZ The "file-login-provider" login identity provider has never existed in any out-of-the-box release of Apache NiFi. If you have created or downloaded some custom implementation of this provider, you would need to consult its author to get it working. Where did you obtain this provider, and what process did you follow to add it to your NiFi installation?

The exception you have shared simply tells you that during startup NiFi loaded the nifi.properties file and found the property "nifi.security.user.login.identity.provider" configured with "file-login-provider"; however, when NiFi parsed the login-identity-providers.xml configuration file, no provider with <identifier>file-login-provider</identifier> was found in that file. I can't provide any guidance on this provider, as I was unable to find anything online about what I expect is a custom add-on provider.

The out-of-the-box authentication providers are listed in the NiFi documentation here:
Apache NiFi 1.2x versions: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user_authentication
Apache NiFi 2.x versions: https://nifi.apache.org/documentation/nifi-2.0.0-M1/html/administration-guide.html#user_authentication

NiFi authentication and authorization are two different and independent configurations. Once you have chosen how you want to handle user authentication, you then move on to setting up user authorization: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#multi-tenant-authorization.

For file-based authorization, NiFi offers two providers:
1. The older, deprecated FileAuthorizer
2. The current StandardManagedAuthorizer

These providers are configured in the NiFi authorizers.xml file. No direct user policies get defined in the authorizers.xml file. The FileAuthorizer, or the FileAccessPolicyProvider referenced by the StandardManagedAuthorizer, generates the initial authorizations.xml file with the initial admin user configured in the chosen provider. You would not typically generate or manipulate this file manually. Instead, you would access your NiFi's UI as that initial admin and define additional user authorizations directly via the NiFi UI.

Here is an example of what you would have in your authorizers.xml if using the StandardManagedAuthorizer:

<authorizers>
    <userGroupProvider>
        <identifier>file-user-group-provider</identifier>
        <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
        <property name="Users File">./conf/users.xml</property>
        <property name="Legacy Authorized Users File"></property>
        <property name="Initial User Identity 1">ronald</property>
    </userGroupProvider>
    <accessPolicyProvider>
        <identifier>file-access-policy-provider</identifier>
        <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
        <property name="User Group Provider">file-user-group-provider</property>
        <property name="Authorizations File">./conf/authorizations.xml</property>
        <property name="Initial Admin Identity">ronald</property>
        <property name="Legacy Authorized Users File"></property>
        <property name="Node Identity 1"></property>
    </accessPolicyProvider>
    <authorizer>
        <identifier>managed-authorizer</identifier>
        <class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
        <property name="Access Policy Provider">file-access-policy-provider</property>
    </authorizer>
</authorizers>

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
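The startup failure described in this post (a login identity provider named in nifi.properties that has no matching identifier in login-identity-providers.xml) can be checked before restarting NiFi with a short Python sketch. The function names are hypothetical, and the sample XML is a trimmed illustration of the file's general layout rather than a complete provider configuration.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative sample; a real provider entry carries more properties.
SAMPLE_LOGIN_PROVIDERS = """
<loginIdentityProviders>
    <provider>
        <identifier>ldap-provider</identifier>
        <class>org.apache.nifi.ldap.LdapProvider</class>
    </provider>
</loginIdentityProviders>
"""


def provider_identifiers(login_providers_xml: str) -> set[str]:
    """Collect every <identifier> declared under a <provider> element."""
    root = ET.fromstring(login_providers_xml)
    return {
        (ident.text or "").strip()
        for ident in root.findall("./provider/identifier")
    }


def configured_provider_exists(configured_id: str, login_providers_xml: str) -> bool:
    """True if the identifier set in nifi.properties is declared in the XML."""
    return configured_id in provider_identifiers(login_providers_xml)


print(configured_provider_exists("ldap-provider", SAMPLE_LOGIN_PROVIDERS))
print(configured_provider_exists("file-login-provider", SAMPLE_LOGIN_PROVIDERS))
```

A False result for the value of nifi.security.user.login.identity.provider corresponds to the exception shared in the question.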
03-04-2024
05:49 AM
1 Kudo
@saquibsk Unfortunately, the exception "java.lang.ClassCastException: null" is not very helpful here, which makes it very difficult to suggest where in the data the issue resides. You might want to try setting the PutDatabaseRecord processor's logging to DEBUG in NiFi's logback.xml to see if it produces more output that might be useful. The logger name is:

org.apache.nifi.processors.standard.PutDatabaseRecord

It is also a good idea to provide the exact version of Apache NiFi or CFM you are using when asking about issues in the community. It allows those assisting to narrow down the scope of where to look for known issues.

Thanks,
Matt
02-26-2024
08:27 AM
3 Kudos
@krishna123 @jameswookyz @rafy NiFi processors are configured with a Run Schedule, and by default that Run Schedule is 0 secs. This tells the NiFi core to schedule the processor to execute as often as possible. The scheduling logic checks whether any of the processor's inbound connections hold queued data, or whether the last execution resulted in data. If there are no inbound queued FlowFiles, the NiFi controller yields the processor's scheduling. This yielding is designed to prevent the processor from constantly trying to schedule when there is no work to do. If there is work to do, the processor gets scheduled to execute. The scheduling check typically consumes microseconds of CPU time, and the built-in yielding prevents excessive CPU usage when no work exists to execute upon.

Adjusting the run schedule does not change the yielding behavior, but when flow is constant for periods of time, changing the run schedule alters the throughput performance.

Hope this clarifies things.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
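The interaction between run schedule and yielding can be illustrated with a toy Python model. This is not NiFi's scheduler: the function name, the one-millisecond tick, the one-FlowFile-per-execution rule, and the 10 ms idle yield are all simplifying assumptions chosen for the example.

```python
def simulate(arrivals: dict[int, int], total_ms: int,
             run_schedule_ms: int = 0, idle_yield_ms: int = 10) -> tuple[int, int]:
    """Toy model of one processor over total_ms one-millisecond ticks.

    arrivals maps a tick to the number of FlowFiles queued at that tick.
    Returns (scheduling_checks, executions).
    """
    queued = 0
    next_eligible = 0
    checks = executions = 0
    for now in range(total_ms):
        queued += arrivals.get(now, 0)
        if now < next_eligible:
            continue  # either yielded or waiting for the run schedule
        checks += 1
        if queued == 0:
            # No work: back off instead of re-checking on every tick.
            next_eligible = now + idle_yield_ms
        else:
            queued -= 1
            executions += 1
            next_eligible = now + max(run_schedule_ms, 1)
    return checks, executions


print(simulate({}, 100))                          # idle processor
print(simulate({0: 5}, 50))                       # run schedule 0
print(simulate({0: 5}, 50, run_schedule_ms=20))   # run schedule 20 ms
```

In this model the idle processor is checked only a handful of times rather than on every tick, and raising the run schedule reduces how many queued FlowFiles get processed in the same window, while the idle back-off stays the same.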
02-26-2024
08:14 AM
@ShyamKumar Your dataflow design is still unclear here.
- "We have different client which will call the same generic PG" --> How is this being done? When you say "clients", are you referring to clients external to NiFi? How are these client requests being sent to and received by NiFi's PG?

We would need to understand your dataflow better before being able to provide better feedback. A detailed use case would be very helpful here.

Thanks,
Matt
02-26-2024
08:08 AM
@sukanta This output is normal and expected. The SubjectAlternativeName (SAN) details are available through the public certificate exchanged in the TLS handshake. As part of TLS, the client verifies the requested hostname against the SAN entries in the server's serverAuth certificate to protect the client from man-in-the-middle type attacks. This information is returned to the client so that you are aware that you tried to access server XYZ, but a server identified by these non-matching SANs responded. There is nothing that can be changed on the server side (NiFi) to change this behavior.

As a side note: you can use openssl to connect to any HTTPS-enabled server and get that server's serverAuth public certificate. You can then also use openssl to view the contents of that public cert, which will include the SAN info. So I am really not seeing any "sensitive" information in what you have shared. If your concern is about IP addresses, stop using them and use DNS-resolvable hostnames in your SAN. Have both private and public resolvable hostnames in the SAN.

If the ask here is not about hiding this public info, but rather about allowing the client to continue to connect while ignoring the possible man-in-the-middle attack, that is a different question. It would be a question to ask your client provider (assuming your browser in this case?).

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
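To see for yourself that the SAN list is public information any client can read, here is a minimal Python sketch using the standard library's ssl module as an alternative to openssl. The function names, the sample certificate dictionary, and the optional CA file parameter are illustrative assumptions; fetching from a real server requires that its certificate chain validates against the trust store in use.

```python
import socket
import ssl
from typing import Optional


def fetch_peer_cert(host: str, port: int = 443,
                    cafile: Optional[str] = None, timeout: float = 5.0) -> dict:
    """Return the server certificate as the dict produced by getpeercert()."""
    context = ssl.create_default_context(cafile=cafile)
    # Skip hostname matching so a SAN mismatch does not abort the handshake
    # before the certificate can be inspected. The chain is still verified.
    context.check_hostname = False
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def san_entries(cert: dict) -> dict[str, list[str]]:
    """Group subjectAltName values by type, e.g. 'DNS' and 'IP Address'."""
    grouped: dict[str, list[str]] = {}
    for kind, value in cert.get("subjectAltName", ()):
        grouped.setdefault(kind, []).append(value)
    return grouped


# Hypothetical certificate dict in the shape getpeercert() returns.
sample_cert = {
    "subjectAltName": (
        ("DNS", "nifi01.example.com"),
        ("DNS", "localhost"),
        ("IP Address", "10.0.0.5"),
    )
}
print(san_entries(sample_cert))
```

Any IP addresses that show up under "IP Address" are the ones the certificate was issued with; replacing them with resolvable hostnames means reissuing the certificate, not changing NiFi.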