Member since
07-30-2019
3471
Posts
1642
Kudos Received
1020
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 152 | 06-03-2026 06:06 PM |
| | 460 | 05-06-2026 09:16 AM |
| | 830 | 05-04-2026 05:20 AM |
| | 499 | 05-01-2026 10:15 AM |
| | 625 | 03-23-2026 05:44 AM |
09-01-2022
03:22 AM
Thank you, Matt. Sorry for the very delayed response, but this was very helpful.
08-22-2022
06:36 PM
Correct, it is currently mounted on every node. I would have thought NiFi would drop the file state once log files are deleted or archived, but it does not look like it does. I can see 15K file states on the tailing processor.
08-15-2022
07:50 AM
@VJ_0082 Since your log is being generated on a remote server, you will need to use a processor that can remotely connect to the external server to retrieve that log.

Possible designs:

1. You could incorporate a FetchSFTP processor into your existing flow. I assume your existing RouteOnAttribute processor is checking for when an error happens with your script? If so, add the FetchSFTP processor between this processor and your PutEmail processor. Configure the FetchSFTP processor (with a "Completion Strategy" of DELETE) to fetch the specific log file created. This dataflow assumes the log's filename is always the same.

2. A second flow could be built using ListSFTP (configured with a filename filter) --> FetchSFTP --> any processors you want to use to manipulate the log --> PutEmail. The ListSFTP processor would be configured to execute on the "primary" node and be configured with a "File Filter Regex". When your 5 minute flow runs and encounters an exception that results in the creation of the log file, this ListSFTP processor will see that file and list it (0 byte FlowFile). That FlowFile will have all the FlowFile attributes needed for the FetchSFTP processor (configured with a "Completion Strategy" of DELETE) to fetch the log, which is added to the content of the existing FlowFile. If you do not need to extract from or modify that content, your next processor could just be the PutEmail processor.

If you found this response assisted with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Thank you,
Matt
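To make the "File Filter Regex" idea from the second design more concrete, here is a minimal Python sketch of how a filename filter narrows a remote listing down to the error log. The pattern and the filenames are hypothetical, and the sketch assumes the filter has to match the whole filename. NiFi evaluates the property as a Java regular expression, so treat this only as an illustration of the filtering step, not as NiFi's implementation.

```python
import re

# Hypothetical pattern: matches "script_error.log" or a dated
# variant such as "script_error_20220815.log".
FILE_FILTER_REGEX = r"script_error(_\d{8})?\.log"


def list_matching(filenames, pattern=FILE_FILTER_REGEX):
    """Return only the filenames that fully match the filter pattern."""
    compiled = re.compile(pattern)
    return [name for name in filenames if compiled.fullmatch(name)]


# Hypothetical directory listing on the remote server.
remote_listing = [
    "script_error.log",
    "script_error_20220815.log",
    "script_output.txt",
    "old_script_error.log.gz",
]

matches = list_matching(remote_listing)
print(matches)
```

Only the files that pass the filter would be listed and handed on to the fetch step; everything else in the directory is ignored.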
08-10-2022
04:43 PM
Thank you! I looked, but not in the right place. I skimmed over the bucket API, but didn't look in detail as I assumed (incorrectly) that it was not relevant to updating a flow.
08-10-2022
11:36 AM
@SandyClouds The ExecuteSQL processors do not support SSH tunnels. These processors expect the SQL server to be listening on a port that is reachable on the network. SSH tunnels are used to access the server remotely and then execute a command locally on that server using the SQL client installed there.

The ExecuteSQL processor uses a DBCPConnectionPool to facilitate the connection to the database. The DBCPConnectionPool establishes a pool of connections used by one to many processors sharing this connection pool to execute their code. A Validation Query is very important to make sure a connection from this pool is still good before it is passed to the requesting processor for use.

While I have not done this myself, I suppose you could set up an SSH tunnel on each NiFi cluster server (example: https://linuxize.com/post/mysql-ssh-tunnel/). Then you could still use the DBCPConnectionPool, except you would use the established tunnel address and port in the database connection URL. The downside is that NiFi has no control over that tunnel, so if the tunnel is closed, your dataflow will stop working until the tunnel is re-established. The Validation Query will verify the connection is still good. If it is not, the DBCPConnectionPool will drop it and try to establish a new connection.

If you found this response assisted with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Thank you,
Matt
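As a rough sketch of the tunnel idea above (untested against a real cluster), the Python below builds the `ssh` local port forward command and the matching JDBC URL that would go into the database connection URL. The host name, user, ports, and database name are all hypothetical placeholders, and the MySQL URL format is only an assumption about the database in use.

```python
def ssh_tunnel_command(local_port, db_host, db_port, ssh_target):
    """Build the argv for a local port forward.

    Traffic sent to 127.0.0.1:local_port on the NiFi node is forwarded
    through ssh_target to db_host:db_port as seen from that server.
    -N tells ssh not to run a remote command, -L sets up the forward.
    """
    return ["ssh", "-N", "-L", f"{local_port}:{db_host}:{db_port}", ssh_target]


def tunneled_jdbc_url(local_port, database):
    """JDBC URL that points at the local end of the tunnel."""
    return f"jdbc:mysql://127.0.0.1:{local_port}/{database}"


# Hypothetical values for illustration only.
cmd = ssh_tunnel_command(3336, "127.0.0.1", 3306, "user@db.example.com")
url = tunneled_jdbc_url(3336, "mydb")

print(" ".join(cmd))
print(url)
```

The tunnel has to be started and kept alive outside of NiFi on every node. The connection pool only ever sees the local address and port.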
08-10-2022
05:56 AM
1 Kudo
@Nifi- You'll need to provide more detail around your use case in order to get more specific assistance.

NiFi offers a number of processor components that can be used to ingest from a database:
- ExecuteSQL
- ExecuteSQLRecord
- CaptureChangeMySQL <-- probably what you are looking for

The ExecuteSQL processors utilize a DBCPConnectionPool controller service for connecting to your specific database of choice. SQL is what needs to be passed to these processors in order to fetch database table entries. The following processors are often used to generate that SQL in the different ways your use case may need in order to fetch in an incremental fashion (for example: generating new SQL for new entries only, so you are not fetching the entire table over and over):
- GenerateTableFetch
- QueryDatabaseTable
- ListDatabaseTables

The CaptureChangeMySQL processor will output a FlowFile for each individual event. You can then construct a dataflow to write these events to your choice of location. That might be some other database.

Once you have created your dataflow for ingesting entries from your table into NiFi, you'll need to use other processors within your dataflow for any routing or manipulation of that ingested data you may want to do before sending it to a processor that writes to the desired destination, possibly the PutDatabaseRecord processor, for example.

If you found this response assisted with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Thank you,
Matt
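To illustrate what "incremental" means here, below is a small Python sketch using an in-memory SQLite table. It remembers the largest `id` it has seen and only asks for rows above that value on the next run. The table name, column names, and the `fetch_new_rows` helper are made up for the example. This shows the general idea only and is not how the NiFi processors are implemented.

```python
import sqlite3


def fetch_new_rows(conn, last_max_id):
    """Fetch only rows whose id is greater than the last id already seen.

    Returns the new rows and the updated maximum id to remember as state.
    """
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id",
        (last_max_id,),
    ).fetchall()
    new_max = rows[-1][0] if rows else last_max_id
    return rows, new_max


# Hypothetical source table with two existing entries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (id, payload) VALUES (?, ?)", [(1, "a"), (2, "b")]
)

# First run: nothing seen yet, so both rows come back.
first_batch, state = fetch_new_rows(conn, 0)

# A new entry arrives; the second run returns only that entry.
conn.execute("INSERT INTO events (id, payload) VALUES (3, 'c')")
second_batch, state = fetch_new_rows(conn, state)

print(first_batch, second_batch, state)
```

The value kept in `state` between runs is what prevents the whole table from being fetched again.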
08-08-2022
05:43 AM
1 Kudo
@nk20 I am confused by your concern about in-memory state. Can you provide more detail around what you are being told or what you have read that has led to this concern? Perhaps those concerns are about something more than component state? Perhaps I can address those specific concerns.

Not all NiFi components retain state. Those that do either persist that state to disk in a local state directory or write that state to ZooKeeper. As long as the local disk where the state directory is persisted is not lost and ZooKeeper has quorum (minimum three nodes), your state is protected for the NiFi components that write state. Out of all the components (processors, controller services, reporting tasks, etc.), there are only about 25 that record state.

The only thing that lives in memory only is component status (in, out, read, write, send, received). These are 5 minute stats that live in memory, and thus any restart of the NiFi service would set these stats back to 0. These have nothing to do with the FlowFiles or the execution of the processor.

If you found this response assisted with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Thank you,
Matt
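On the quorum point, a quick bit of arithmetic may help. ZooKeeper needs a strict majority of its ensemble to be up, so the short Python sketch below works out the majority size and how many node failures an ensemble can absorb. The function names are just for illustration.

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be lost while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for nodes in (1, 3, 5):
    print(nodes, quorum_size(nodes), tolerated_failures(nodes))
```

A single node cannot survive any failure, while three nodes can lose one and still keep quorum, which is why three is the usual minimum.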
08-07-2022
11:47 PM
@MattWho @ckumar thanks for your inputs! I was able to resolve the issue following the steps you mentioned. Much appreciated!
08-04-2022
01:10 PM
@code Have you considered using GenerateTableFetch, QueryDatabaseTable, or QueryDatabaseTableRecord to generate SQL that you then feed to ExecuteSQL, so that you avoid getting both old and new entries with each execution of your existing flow? Avoiding the ingestion of duplicate entries is better than trying to find duplicate entries across multiple FlowFiles.

You can detect duplicates within a single FlowFile using DeduplicateRecord; however, this requires that all records are merged into a single FlowFile. You can use DetectDuplicate; however, this requires that each FlowFile contains one entry to compare. Using these methods adds a lot of additional processing to your dataflows, or holds records in your flow longer than you want, so this is probably not the ideal solution.

If you found this response assisted with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Thank you,
Matt
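To show why after-the-fact duplicate detection costs more than avoiding duplicates up front, here is a minimal Python sketch of key-based deduplication. It has to see every record and remember every key it has encountered. The record layout and the `split_duplicates` helper are hypothetical and are not the NiFi processors' actual logic.

```python
def split_duplicates(records, key_fields):
    """Separate records into first occurrences and later duplicates.

    A record counts as a duplicate when its values for key_fields
    have already been seen earlier in the input.
    """
    seen = set()
    unique, duplicates = [], []
    for record in records:
        key = tuple(record[field] for field in key_fields)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates


# Hypothetical records where id 1 was ingested twice.
records = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "a"},
]

unique, duplicates = split_duplicates(records, ["id"])
print(unique)
print(duplicates)
```

The `seen` set grows with the number of distinct keys, which is the extra holding and processing cost mentioned above.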
08-04-2022
12:52 PM
@mhsyed The latest Cloudera Runtime version can be found here (latest at the top of the list): https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/release-guide/topics/cdpdc-runtime-download-information.html

So the latest version is CDH-7.1.7-1.cdh7.1.7.p1000.24102687 (CDP 7.1.7 Service Pack 1).

If you found this response assisted with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Thank you,
Matt