Member since
07-30-2019
2859
Posts
1412
Kudos Received
828
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 114 | 03-07-2024 08:25 AM |
| | 131 | 03-06-2024 09:40 AM |
| | 111 | 03-04-2024 07:30 AM |
| | 143 | 02-26-2024 08:27 AM |
| | 244 | 02-26-2024 07:32 AM |
03-15-2024
09:54 AM
1 Kudo
@Chaitanya_Y I am not sure why the Apache NiFi community did not release dedicated migration guidance with the Apache NiFi 1.25 release. However, there are release notes that highlight notable changes: https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.25.0 You will want to read through all the migration guidance for every release between 1.16 and 1.25 to see if anything applies to your specific setup or dataflows. Take note of any deprecated components you may be using currently and any components that were removed from the default release (removed does not mean gone; you can download those removed NARs from the central repository and add them to the 1.25.0 release if needed). If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
03-15-2024
06:57 AM
2 Kudos
@Chaitanya_Y Apache NiFi 1.16.0 is 2 years old at this point, and there have been many bug fixes since then specific to parameter contexts. I recommend upgrading to the latest Apache NiFi 1.25.0 release. Some of the fixed issues relate to the problems you have shared. Parameter context fixes since the Apache NiFi 1.16.0 release: https://issues.apache.org/jira/browse/NIFI-10096?jql=project%20in%20(NIFI%2C%20NIFIREG)%20AND%20fixVersion%20in%20(1.16.1%2C%201.16.2%2C%201.16.3%2C%201.17.0%2C%201.18.0%2C%201.19.0%2C%201.19.1%2C%201.20.0%2C%201.21.0%2C%201.22.0%2C%201.23.0%2C%201.23.1%2C%201.23.2%2C%201.24.0%2C%201.25.0%2C%201.26.0)%20AND%20text%20~%20%22%5C%22parameter%20context%5C%22%22%20ORDER%20BY%20created%20DESC%2C%20priority%20DESC%2C%20updated%20DESC If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
03-12-2024
06:22 AM
1 Kudo
@broobalaji HDF 1.8.0.3.3.1.0-10 was released way back in 2017. I strongly recommend upgrading to a much newer release of CFM. NiFi templates have been deprecated and are completely removed as of the Apache NiFi 2.x releases. Apache NiFi deprecated templates for a number of reasons:
1. Templates uploaded to NiFi reside within NiFi's heap memory space, even if never instantiated/imported to the NiFi canvas.
2. Large uploaded templates, or many uploaded templates, can have a substantial impact on NiFi performance because of the amount of heap they consume. Simply increasing the size of NiFi's heap is not the best answer to that heap usage either, since large heaps lend themselves to longer stop-the-world garbage collections in the JVM.
3. Apache NiFi deprecated and moved away from XML-based flows in favor of JSON flow definitions around the Apache NiFi 1.16 time frame. Flow definitions (JSON files) can be exported and imported without being loaded into heap memory within NiFi.

The above info aside, it is best to use the developer tools available in your web browser to inspect/capture the REST API call being made when you perform the same steps directly via the NiFi UI. This makes it easy to understand the calls that need to be made in your automation. If you continue to use templates, I also encourage you to upload, import to the UI, and then delete the uploaded template to minimize heap impact. Thanks, Matt
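As a minimal sketch of the "capture in dev tools, replay in automation" idea: once you see the request the UI makes, you can rebuild it in a script. The base URL and token below are placeholders, not values from this thread, and the endpoint shown (`/nifi-api/flow/about`) is just one simple NiFi 1.x GET used for illustration.

```python
import urllib.request

# Sketch only: rebuild a REST API call observed in the browser dev tools.
# Host, port, and token here are hypothetical placeholders.
def build_request(base_url, token):
    # A secured NiFi expects a bearer token in the Authorization header.
    req = urllib.request.Request(f"{base_url}/nifi-api/flow/about")
    req.add_header("Authorization", f"Bearer {token}")
    return req

req = build_request("https://nifi.example.com:8443", "<token>")
print(req.full_url)
```

Sending the request is then a matter of `urllib.request.urlopen(req)` (with appropriate TLS configuration for your environment).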
03-11-2024
08:56 AM
@whoknows Providing an actual CVE for the suspected detected vulnerability is always going to get you the best response. I am assuming you may be referring to this CVE? https://www.cvedetails.com/cve/CVE-2024-22233/ Apache NiFi is not vulnerable to this CVE because NiFi does not use Spring MVC; it uses JAX-RS and Jersey for REST resources. The vulnerability is only exposed when all of the following are true:
* The application uses Spring MVC
* Spring Security 6.1.6+ or 6.2.1+ is on the classpath

As far as upgrading directly from Apache NiFi 1.19.1 to 1.25 goes, you should have no issues there provided you have reviewed the release notes below for all versions from 1.20 to 1.25 to see if any changes may impact your specific dataflows: https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.25.0 I saw no red flags to worry about.

Apache NiFi also upgraded its Spring Framework version in https://issues.apache.org/jira/browse/NIFI-12811 in Apache NiFi 2.0. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
03-11-2024
08:39 AM
@Ghilani
1. Are you getting the same exact Invalid SNI exception?
2. Are you using the keystore and truststore built by Apache NiFi out of the box?
3. Did you try using "localhost" if NiFi is on the same host as the browser being used to access it?
4. If the browser is on a different host than NiFi, did you use the hostname instead of the IP address for the target host where NiFi is running?
5. Did you list the keystore used by your running NiFi to inspect the SAN entries it has set up?
Thanks, Matt
03-11-2024
07:41 AM
1 Kudo
@TreantProtector There is a lot of ask in this one post.

1. NiFi Registry is used to store version-controlled NiFi process groups (this takes manual user action both to initiate version control and to push new versions to NiFi Registry). It does not store the flow.xml.gz or flow.json.gz files that contain all the flow information NiFi loads on startup, so it is not a substitute for protecting those files on NiFi. All nodes in a NiFi cluster use the same flow.xml.gz/flow.json.gz, so it is not necessary to preserve the files from every node for recovery.

2a. (NiFi) Apache NiFi stores the complete dataflow(s) on your canvas in the flow.xml.gz (legacy format) and flow.json.gz (current format). Preserving this file will preserve all your dataflows on the canvas. (NOTE: all sensitive properties like passwords are encrypted in these files using the configured sensitive props key in NiFi, so make sure you save that password or you will need to scrub these files of all enc{...} values to load them. Removing those values would require you to re-enter all encrypted values in the NiFi components.)
- Apache NiFi has a local state directory configured. This is unique to each node and stores state information for processors that store local state. It should be preserved to avoid data duplication.
- Apache NiFi content_repository(s): holds active content claims (claims still used by actively queued FlowFiles within your dataflows) and archived content claims (archive subdirectories holding claims which are not referenced by any active FlowFiles in the UI). This repository is tightly coupled to the flowfile_repository. Content repositories hold claims unique to each node and need to be protected on all nodes to avoid data loss.
- Apache NiFi flowfile_repository: contains metadata/attributes (including the reference to the content claim in the content_repository(s) along with byte offset and length). Tightly coupled to the content_repository(s) on the same node, so make sure the same flowfile_repository is loaded with the corresponding content_repository(s) from the same node. This must be protected to avoid data loss.
- Apache NiFi provenance_repository: holds event data about FlowFile transactions and is unique per node. Loss of these is a loss of provenance history, but would not cause loss of any queued FlowFiles. These are typically also placed on protected storage.
- Apache NiFi metadata_repository: metadata about users who authenticated to NiFi and flow configuration history when using the embedded H2 DB. Not necessary to retain unless you want to preserve that historical information.
- The NiFi extensions directory contains any custom NiFi NARs you have added to your NiFi. Copies of your custom NARs should be preserved somewhere so they are not lost and can be restored easily should that be needed.
- Apache NiFi local authorization files like users.xml and authorizations.xml, which contain the users and their associated authorizations granted over time through the NiFi UI, should be preserved or you'll need to set those back up again in recovery (same on all nodes).
- Node-specific local directories configured in your dataflows (dataflows built on the canvas). Some components may allow you to configure local directories for persistent storage. If you are using these they should be persisted. Example: DistributedMapCacheServer 1.25.0.

2b. NiFi Registry
- The NiFi Registry database, which contains all information about version-controlled flows and buckets, should be protected unless you are using an external DB which you are protecting by other means. The default uses an embedded H2 DB.
- The NiFi Registry extensions directory, if being used to store version-controlled extensions (jars).
- The NiFi Registry persistence provider stores the actual version-controlled NiFi process groups and is tightly coupled to the NiFi Registry database. If using the external GitFlowPersistence provider, refer to git for its persistence requirements.
- NiFi Registry bundle persistence has local and S3 options; protected storage should be used if using local.
- NiFi Registry local authorization files like users.xml and authorizations.xml, which contain the users and their associated authorizations granted over time through the NiFi Registry UI, should be preserved or you'll need to set those back up again in recovery.
Reference material: https://nifi.apache.org/docs/nifi-registry-docs/html/administration-guide.html#backup-recovery

3. Covered above. Refer to the Apache NiFi nifi.properties file for your configured local storage paths.

4. Yes, covered above.

5a. Not sure I follow the question. On restoration, NiFi or NiFi Registry will read the persistence provider (whether local, git, or S3); preserving the NiFi and NiFi Registry conf directory configuration files would make restoration easier. While the NiFi content_repository(s) and flowfile_repository are tightly coupled to one another on the same node and tie back to the flow.xml.gz/flow.json.gz content (same on all nodes), which node they get restored to does not matter (node-specific information is not present in any of those). NOTE: content repositories are directly correlated to the content repository property names in the nifi.properties file:
nifi.content.repository.directory.default=/dir1/node1
nifi.content.repository.directory.repo2=/dir2/node1
Upon restoration, the content_repository contents persisted for /dir1/node1 must still be set in "default" and not moved to a different property name. This is because the FlowFile metadata in the corresponding flowfile_repository does not contain directory details. It simply says you can find the content for FlowFile xyz in nifi.content.repository.directory.default at sub-directory (num), content claim, byte offset, and number of bytes. So if you put dir2 in the default content_repository you'll break content lookup.

6. Zookeeper is used to store cluster state used by a good number of NiFi processors (refer to the individual processor documentation for state information; every processor's documentation has a "state management" section that tells you whether the specific component stores state and whether that state is local or cluster). State is stored per component. Cluster state stored in Zookeeper is not node-specific, since all components that use cluster state utilize the same state information. Failing to protect against loss of state info typically leads to data duplication, but it all depends on how a given processor uses that state information. Example: ListSFTP 1.25.0.

If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
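The content-claim pointer described in 5a can be sketched as a small data structure. This is an illustrative model only, not NiFi source code; the field names and example values are invented for clarity.

```python
from dataclasses import dataclass

# Illustrative sketch (NOT NiFi's implementation) of what the flowfile_repository
# records about a FlowFile's content: the content repository property suffix,
# a sub-directory, the claim, a byte offset, and a length. Note that no
# directory path is stored, only the property suffix.
@dataclass(frozen=True)
class ContentClaimRef:
    repository: str  # e.g. "default" -> nifi.content.repository.directory.default
    section: str     # sub-directory within that repository
    claim_id: str    # claim file holding one or more FlowFiles' content
    offset: int      # byte offset of this FlowFile's content within the claim
    length: int      # number of bytes belonging to this FlowFile

# Hypothetical example values. Because only the property suffix is recorded,
# moving /dir1/node1 under a different property name breaks this lookup.
ref = ContentClaimRef("default", "512", "1709572800-1", 4096, 1024)
print(ref.repository)
```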
03-07-2024
08:25 AM
@sukanta This depends on what version of Apache NiFi is being used. In Apache NiFi 1.12 or newer, there exists the following property in the nifi.properties file for excluding the server version in HTTP responses: nifi.web.should.send.server.version=<true or false> The default is true when not configured. This capability was added as part of https://issues.apache.org/jira/browse/NIFI-7321 It is best to ask unrelated questions in separate community threads. Asking multiple questions makes it hard for others in the community to understand which question was addressed by the "accepted" solution. Keep in mind that Apache NiFi is an open source product, so anyone can look at the source code to see what Jetty version(s) are being used. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
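For reference, the relevant nifi.properties entry (set to the non-default value) would look like this:

```properties
# nifi.properties -- suppress the server version in HTTP response headers
# (defaults to true when not set; available in Apache NiFi 1.12+)
nifi.web.should.send.server.version=false
```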
03-06-2024
09:40 AM
1 Kudo
@hegdemahendra The Service Unavailable response to a request received by the HandleHTTPRequest processor is most commonly the result of the FlowFile produced by the HandleHTTPRequest processor not being processed by a downstream HandleHTTPResponse processor before the Response Expiration configured in the StandardHTTPContextMap controller service. This aligns with the exception you shared from the NiFi logs. If you are seeing this exception prior to the 3-minute expiration you set, it is possible your client is closing the connection due to some client-side timeout. You would need to look at the client sending the requests to get details and options.

You mentioned you have 4200+ processors that are scheduled based on their individual configurations. When a processor is scheduled it requests a thread from the configured Maximum Timer Driven Thread Count pool of threads. So you can see that not all processors can execute concurrently, which is expected. You also have only 8 cores, so assuming hyper-threading you are looking at the ability to actually service only 16 threads concurrently. What you have happening is time slicing, where each of your up to 200 concurrently scheduled threads gets bits of time on the CPU cores. Good to see you looked at your core load average, which is very important as it helps you determine a workable size for your thread pool. If you have a lot of CPU-intensive processors executing often, your CPU load average is going to be high. In your case I see well-managed CPU usage with some occasional spikes. I brought this up because it directly relates to your processor scheduling.

The HandleHTTPRequest processor creates a web server that accepts inbound requests. These requests will stack up within that web server as the processor's threads read them and produce a FlowFile for each request. How fast this can happen depends on available threads and the Concurrent Tasks configuration on the HandleHTTPRequest processor's scheduling tab. By default an added processor has only 1 concurrent task configured. If you set this to, say, 5, then the processor could potentially be allocated up to 5 threads to process requests received by the HandleHTTPRequest processor. Another possibility is that you are seeing Service Unavailable because the container queue is filling faster than the processor is producing FlowFiles.

Hope this information helps you in your investigation and solution for your issue. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
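The thread arithmetic above can be made explicit. This is just the back-of-envelope calculation from the scenario described (8 cores, hyper-threading, a pool of 200 timer-driven threads), not NiFi code:

```python
# Back-of-envelope illustration of the numbers discussed above.
cores = 8
threads_per_core = 2                          # hyper-threading assumed
truly_concurrent = cores * threads_per_core   # only 16 threads run at once
max_timer_driven_threads = 200                # configured thread pool size

# Everything beyond `truly_concurrent` is time-sliced across the CPU cores:
oversubscription = max_timer_driven_threads / truly_concurrent
print(truly_concurrent, oversubscription)
```

A high sustained load average relative to `truly_concurrent` is the signal that the pool is sized too aggressively for the hardware.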
03-05-2024
08:46 AM
Correct. A FlowFile might over its dataflow lifetime point at different content claims for its content. That all depends on the processors used in the dataflow. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
03-04-2024
08:34 AM
1 Kudo
@Chetan_mn NiFi's DistributedMapCacheServer controller service has existed in Apache NiFi since the 0.x releases, when Apache NiFi offered no HA at all. In the Apache NiFi 0.x releases NiFi had a dedicated NiFi Cluster Manager (NCM) and NiFi cluster nodes that all reported to the NCM. The only way to access the NiFi UI was via the NCM server. The DistributedMapCacheServer controller service at that time ran only on the NCM and not on any of the cluster nodes. The DistributedMapCacheClient controller service ran on all the nodes so that all cluster nodes could read and write to the same cache server.

Fast forward to the Apache NiFi 1.x+ releases, where Apache NiFi eliminated the NCM with a new zero-master clustering ability. This provides HA at the NiFi control layer so that users can access the NiFi cluster via ANY connected node's URL. Since there is no dedicated NCM anymore, controller services like DistributedMapCacheServer, when added, now start a DistributedMapCache server on each node independently of one another. The multiple cache servers do NOT communicate or share cache entries with one another. So you effectively have a single point of failure with this cache provider: if the node your DistributedMapCacheClient is configured for goes down, you have an outage until it is recovered. Apache NiFi now offers better options with a true distributed map cache capability, such as HBase, Redis, and Hazelcast. These utilize an externally installed map cache service that offers better fault tolerance through HA, but adds an additional service dependency.

Now if you still choose to use the DistributedMapCacheServer, keep in mind that all cached entries will be held in NiFi's heap memory. So the larger the cache entries are and the larger the number of cache entries held, the more NiFi heap will be consumed. The DistributedMapCacheServer has an optional "Persistence Directory" configuration. When configured, the cache entries will be persisted to disk in the configured location. The amount of space required again depends on cache entry size and the number of entries retained. Keep in mind that configuring this persistence directory does NOT remove cache entries from NiFi's heap memory. It simply persists a copy of the cache entries to disk so that should NiFi be restarted, NiFi can reload the cache from disk into heap memory. If no persistence directory is configured, NiFi going down would result in a loss of all cache entries. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
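The persistence-directory behavior described above can be sketched with a toy cache. This is NOT NiFi's implementation, just an illustration of the semantics: entries always live in memory, and the persistence directory only keeps a copy on disk so entries can be reloaded after a restart.

```python
import json
import tempfile
from pathlib import Path

# Toy model of the described behavior: in-memory cache with an optional
# on-disk copy. Class and file names here are invented for illustration.
class MapCache:
    def __init__(self, persistence_dir=None):
        self.entries = {}            # every entry is held in memory ("heap")
        self._file = None
        if persistence_dir is not None:
            self._file = Path(persistence_dir) / "cache.json"
            if self._file.exists():
                # simulate a restart: reload persisted entries into memory
                self.entries = json.loads(self._file.read_text())

    def put(self, key, value):
        self.entries[key] = value
        if self._file is not None:
            # persist a *copy*; the in-memory entry is not evicted
            self._file.write_text(json.dumps(self.entries))

    def get(self, key):
        return self.entries.get(key)

# Entries survive a "restart" only when a persistence directory is set.
with tempfile.TemporaryDirectory() as d:
    MapCache(d).put("last-file", "data.csv")
    survived = MapCache(d).get("last-file")
print(survived)
```

Without a persistence directory, the second `MapCache` instance would start empty, which mirrors the data loss described above.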