Member since
07-30-2019
3467
Posts
1641
Kudos Received
1016
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 191 | 05-04-2026 05:20 AM |
| | 450 | 03-23-2026 05:44 AM |
| | 341 | 02-18-2026 09:59 AM |
| | 590 | 01-27-2026 12:46 PM |
| | 1024 | 01-20-2026 05:42 AM |
02-10-2021
10:27 AM
1 Kudo
@bsivalingam83 The ability to "ignore" properties in various NiFi config files was added in the CFM 1.0.1 release. With older CFM versions (1.0.0) you can set a safety valve to overwrite the current java.arg.13 value with something else. Such a safety valve simply defines a key=value pair that the NiFi bootstrap will not act on. The end result: NiFi no longer uses G1GC and instead falls back to the default garbage collector for the version of Java in use. Hope this helps, Matt
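As a sketch, such a safety valve entry might look like the following (the property name java.arg.13 comes from the post; the replacement value is a hypothetical placeholder):

```properties
# bootstrap.conf override via the CFM safety valve (hypothetical value):
# replaces the G1GC JVM argument with a harmless key=value pair that
# the NiFi bootstrap does not act on.
java.arg.13=-Dgc.override.placeholder=true
```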
02-10-2021
09:03 AM
3 Kudos
@Jarinek The process really depends on what update you are trying to make.

1. You cannot remove a connection that has queued FlowFiles in it, but you can redirect it, queued data and all, to a different target processor.
2. You cannot redirect a connection if the processor it is currently attached to still has a running thread. Stopping a processor does not kill its threads; it simply tells the processor not to execute again on the configured run schedule. Existing threads continue to run until they complete. Until all threads exit, the processor is still in a state of "stopping" even though the UI shows the red "stopped" square.
3. You cannot modify a processor if it still has running threads (see the note about "stopping" processors above).
4. If you stop the component on the receiving side of a connection, any FlowFiles queued on that connection (not tied to an active thread still running on the target component) will not be processed and will remain queued. You can manually empty a queue through a REST API call (which means data loss), but that is not necessary if you are not deleting the connection.

Attempts to perform configuration changes while components still have active threads or are in a running state will result in an exception being thrown and the change not happening. Attempts to remove connections that have queued FlowFiles will likewise throw an exception and block the removal.

If all you are trying to do is modify some configuration on a processor, you just need to stop the processor, check that it has no active threads, make the config change, and start the processor again.

I am not sure what you are asking with "update the flow ignoring any data in failure or error connection queues". NiFi does not ignore queued FlowFiles. It is also not wise to leave connections with queued FlowFiles sitting around your dataflows. Those old queued FlowFiles will prevent removal of the content claims that contain their data. Since a content claim can contain the data of one to many FlowFiles, this can result in your content repository filling up; NiFi can only remove content claims that no FlowFiles point to anymore.

Here are some useful links:
https://nipyapi.readthedocs.io/en/latest/nipyapi-docs/nipyapi.html
https://github.com/Chaffelson/nipyapi
http://nifi.apache.org/docs/nifi-docs/rest-api/index.html
https://community.cloudera.com/t5/Community-Articles/Update-NiFi-Connection-Destination-via-REST-API/ta-p/244211
https://community.cloudera.com/t5/Community-Articles/Change-NiFi-Flow-Using-Rest-API-Part-1/ta-p/244631

Hope this helps, Matt
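To make the REST route concrete, here is a minimal sketch of building the JSON body that NiFi's REST API expects when re-pointing a connection to a new destination processor (a PUT to /nifi-api/connections/{id}); all ids and names below are hypothetical placeholders, not values from the post:

```python
# Sketch: construct the payload for PUT /nifi-api/connections/{id} to
# change a connection's destination. The revision must match the
# server's current revision for the connection.
import json

def build_connection_update(connection_id, client_id, version,
                            new_dest_id, dest_group_id):
    """Return the JSON body for redirecting a connection's destination."""
    return {
        "revision": {"clientId": client_id, "version": version},
        "component": {
            "id": connection_id,
            "destination": {
                "id": new_dest_id,
                "groupId": dest_group_id,
                "type": "PROCESSOR",
            },
        },
    }

# Hypothetical example; in practice this would be sent with an HTTP
# client, e.g. requests.put(url, json=payload, ...).
payload = build_connection_update("conn-1234", "my-client", 3,
                                  "proc-5678", "pg-root")
print(json.dumps(payload, indent=2))
```

Remember the ordering rules above: the old destination processor must be stopped with no active threads before this PUT will succeed.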
02-09-2021
05:52 AM
@medloh That is the correct solution here; the filename is always stored in a FlowFile attribute named "filename". Using the UpdateAttribute processor is the easiest way to manipulate FlowFile attributes. You can use other attributes, static text, and even subjectless Expression Language functions like "now()" or "nextInt()" to create dynamic filenames for each FlowFile. https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html Hope this helps, Matt
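For example, an UpdateAttribute processor could set a dynamic filename with a property like the following (the "data-" prefix and ".json" extension are hypothetical; "now()" and "nextInt()" are the subjectless functions mentioned above):

```properties
# UpdateAttribute dynamic property (hypothetical example):
# property name: filename
filename=data-${now():format('yyyyMMddHHmmss')}-${nextInt()}.json
```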
02-09-2021
05:48 AM
@Umakanth The GetSFTP processor first builds a listing of all files on the target SFTP server that it intends to retrieve, and then fetches all of those files. Unlike the ListSFTP processor, GetSFTP is an older, deprecated processor that does not store state. My guess is that at times the listing is larger than at other times, or, as you mentioned, occasional latency leaves enough time between creating that list and actually consuming the files for the source system to move a listed file before it is grabbed. In that case, moving to the newer ListSFTP and FetchSFTP processors will help handle that scenario. ListSFTP lists all the files it sees, and FetchSFTP fetches the content of those that have not yet been moved by the source system. FetchSFTP will still throw an exception for each file it cannot find and route those FlowFiles to the not.found relationship, which you can handle programmatically in your NiFi dataflow(s). Thanks, Matt
02-08-2021
08:45 AM
@medloh The schema only needs to be defined in the Record Reader configured in the PutParquet processor. The ConvertRecord processor has both a Record Reader and a Record Writer; the Record Writer can inherit the schema from the Record Reader or define its own schema. Hope this helps, Matt
02-08-2021
08:20 AM
@Jarinek NiFi Variables can only be used in component properties that support NiFi Expression Language (EL). NiFi Parameters can be used in ANY component property, including sensitive (encrypted) ones. This gives users more flexibility, especially users who use NiFi Registry to promote version-controlled process groups across multiple NiFi instances/clusters. Different environments often use different URLs and passwords within the same dataflows. A dataflow can thus be promoted to another environment that simply uses different Parameter values, so the user does not need to update a large number of components each time a new version of the flow is promoted from one environment to another. You are correct that Parameters are similar to Variables with respect to assignment to a process group: you can only have one Parameter Context assigned to a Process Group. Hope this helps, Matt
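To illustrate the syntax difference (property names and values here are hypothetical examples): Variables are referenced with EL's ${...} form and Parameters with the #{...} form.

```properties
# Variable reference -- only works in EL-capable properties:
Remote URL=${api.base.url}/v1/ingest

# Parameter reference -- works in any property, including sensitive ones:
Password=#{sftp.password}
```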
02-05-2021
08:09 AM
1 Kudo
@medloh The article you are using for reference is old and a bit out of date. As part of the work that went into NIFI-3921, the schema properties within the PutParquet processor were removed. Before those changes, at the time of the article you referenced, you had to set the schema properties on the processor and they had to match the schema properties set in the Record Reader. With the changes, the processor simply gets the schema from the reader, so it does not need to be configured a second time in the processor properties. Also, at the time of that article there were no ParquetReader or ParquetRecordSetWriter controller services. Now that NiFi has a Parquet reader and writer, you can use the ConvertRecord processor to read source FlowFiles and convert them to Parquet within your dataflow, and then you are free to use whatever processor you want downstream to write out the Parquet content. You can think of PutParquet as a combination of ParquetRecordSetWriter and PutHDFS, with only the Record Reader being selectable. Hope this helps, Matt
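A minimal sketch of the ConvertRecord setup described above (the CSVReader choice is a hypothetical example; use whichever reader matches your source format):

```properties
# ConvertRecord processor configuration (sketch):
Record Reader=CSVReader
Record Writer=ParquetRecordSetWriter
# Downstream, any processor (PutHDFS, PutFile, ...) can write out the
# Parquet-formatted content.
```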
02-02-2021
07:12 AM
@BhaveshP I am in complete agreement with @tusharkathpal's response, but you should be able to work around this issue through a configuration change in your nifi.properties file: nifi.web.proxy.host=dev.example.com:<port number> Property description: A comma-separated list of allowed HTTP Host header values to consider when NiFi is running securely and will be receiving requests to a different host[:port] than it is bound to. For example, when running in a Docker container or behind a proxy (e.g. localhost:18443, proxyhost:443). By default, this value is blank, meaning NiFi should only allow requests sent to the host[:port] that NiFi is bound to. Since the hostname your client is using does not match any SAN in the individual nodes' certificates, the above property allows NiFi to accept this additional hostname. The other option is to create new certificates for each of your NiFi nodes with "dev.example.com" added as an additional SAN entry. Hope this helps, Matt
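Since the property accepts a comma-separated list, a sketch with multiple allowed Host header values might look like this (the port numbers are hypothetical examples):

```properties
# nifi.properties -- allowed HTTP Host header values (comma-separated):
nifi.web.proxy.host=dev.example.com:9443,localhost:18443
```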
02-02-2021
06:52 AM
@Umakanth Any chance you are running a NiFi cluster (multiple NiFi nodes), or that you have multiple systems all trying to consume the same data from this same SFTP server? It is possible that one host finished reading the file first and removed it before the other hosts could finish reading it. SFTP is not a cluster-friendly protocol; if this processor is used in a NiFi cluster, it should be configured to execute on the "primary node" only. Otherwise all nodes in your cluster will fight to consume the same source files, and you can expect to see exceptions. The GetSFTP processor is also deprecated in favor of the newer ListSFTP and FetchSFTP pair. With the newer processors the flow becomes: ListSFTP (primary node only; produces 0-byte FlowFiles) --> load-balanced connection (balances FlowFiles across all nodes in the cluster) --> FetchSFTP (executes on all nodes; retrieves the specific content for each FlowFile). Hope this helps, Matt
02-02-2021
06:41 AM
1 Kudo
@Arash In your 4-node NiFi cluster, what value do you have set for the "nifi.remote.input.host" property in the nifi.properties file on each of the 4 nodes? It should be the FQDN of each node, not the same value on all 4 nodes. From the host where MiNiFi is running, can all 4 of those FQDNs be resolved and reached over the network? If not, the MiNiFi RPG will only be able to send successfully to the one FQDN it can reach. When the RPG is started, it reaches out to the URL configured in the RPG to obtain S2S details from the target host. That target host collects the host details of all currently connected nodes in the cluster and communicates them back to the client (MiNiFi). If all 4 nodes report the same configured FQDN in the "nifi.remote.input.host" property, the client only knows of one FQDN to which it can send FlowFiles over Site-to-Site (S2S). To improve redundancy in the RPG, you can provide a comma-separated list of URLs in the RPG configuration, so that if any one node is down, the RPG can try to fetch S2S details from the next host in the list. Hope this helps, Matt
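A sketch of the per-node setting and the redundant RPG URL list described above (all hostnames and ports are hypothetical examples):

```properties
# nifi.properties on each node -- each node's OWN FQDN, not a shared name:
# on node 1:
nifi.remote.input.host=nifi-node1.example.com
# on node 2:
nifi.remote.input.host=nifi-node2.example.com

# RPG target URLs configured in MiNiFi -- comma-separated for redundancy:
# https://nifi-node1.example.com:8443/nifi,https://nifi-node2.example.com:8443/nifi
```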