Member since
05-16-2021
8
Posts
2
Kudos Received
0
Solutions
06-03-2021
06:32 AM
Hi @MattWho , Thanks for the response. I'll happily create a feature request, though I wasn't sure if I was missing something obvious that would meet my objective/requirements. Thank you for clarifying, I'll go take a look at the links you provided (admittedly I had missed the mailing lists, oops). Kind regards Mark
05-30-2021
09:56 AM
I hope a bump to the top of the recent post list isn't breaking any rules, but I'm hoping someone might be able to offer an opinion, so a bump it is!
05-21-2021
02:23 PM
Hi, I've just noticed the date strings in your sample aren't consistent, which I think will make things difficult. For example, in line 1 your OrderDate format is "M/dd/yyyy" (single-digit month) while your ship date seems to be "MM/dd/yyyy" (double-digit month). For the CSV to be read correctly, all of your date fields would need to adhere to the same format, I believe. You can do a couple of things to resolve this: 1. Fix it in the source - the best option in my opinion, if possible. 2. Clean the data in NiFi before it arrives at the PutDatabaseRecord processor. Unfortunately the second is a touch beyond my current level of expertise, but if I were you I would explore either a ReplaceText processor with an appropriate regex, or an UpdateRecord processor with a record path expression to update this field in the flowfile - note this would require a separate CSVReader service that reads your data as strings. Good luck.
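To illustrate the ReplaceText idea, here is a minimal sketch of a regex that pads single-digit month/day fields to two digits, so every date ends up matching "MM/dd/yyyy". The sample row and column layout are hypothetical; the same Search Value / Replacement Value pair could be tried in a ReplaceText processor, though note this naive pattern would also pad any other digit-before-slash text in the row:

```java
// Sketch: normalize dates like "5/2/2021" to "05/02/2021" so all date
// fields share the MM/dd/yyyy format. The CSV row below is hypothetical.
public class DatePadDemo {
    public static void main(String[] args) {
        String line = "1001,5/2/2021,5/14/2021,Widget";
        // \b(\d)/ matches a lone digit immediately before a slash;
        // the replacement prefixes it with a zero.
        String fixed = line.replaceAll("\\b(\\d)/", "0$1/");
        System.out.println(fixed); // 1001,05/02/2021,05/14/2021,Widget
    }
}
```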
05-21-2021
02:05 PM
Hi, Check your "Date Format" setting in the CSVReader service, where you can specify a format string for interpreting text as a date. At a guess, either "M/dd/yyyy" or "MM/dd/yyyy" would work based on the data sample provided. More information on date format strings here: https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html As an aside, I don't think the CSVRecordSetWriter settings are much use here, as you're terminating all outbound PutDatabaseRecord relationships. Hope that helps.
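As a quick check of how those format strings behave: with SimpleDateFormat (which the linked docs describe), a single "M" in the pattern parses both one- and two-digit months, so "M/dd/yyyy" should cover rows like "5/21/2021" and "05/21/2021". A small sketch, using hypothetical sample dates:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

// Demonstrates that the pattern "M/dd/yyyy" accepts both single- and
// double-digit months when parsing; sample dates are hypothetical.
public class DateFormatDemo {
    public static void main(String[] args) throws ParseException {
        SimpleDateFormat fmt = new SimpleDateFormat("M/dd/yyyy");
        fmt.setLenient(false);
        // Both inputs parse to the same date (May 21, 2021).
        System.out.println(fmt.parse("5/21/2021"));
        System.out.println(fmt.parse("05/21/2021"));
    }
}
```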
05-21-2021
01:25 PM
Hello, I'd like to implement a lock mechanism in a data flow that prevents a flowfile from progressing. Similar in a way to the Wait processor, holding a flowfile in a queue while another flowfile with a corresponding signal attribute passes through a 'gate' further down the flow, similar to a Notify. In my specific case I load a number of files, received together, into target tables. I have a single flow which does the following:
1. PutSQL - issues a truncate table command that empties the target staging table. The table name is taken from an attribute.
2. PutDatabaseRecord - inserts the records into the staging table based on the same attribute.
3. PutSQL - executes a stored procedure, passing the relevant table name as a parameter from the attribute. This SP merges data from the staging table into a 'main' table.
A number of these files share the same staging and target tables, and on occasion I hit a race condition where the staging table is truncated by a second flowfile before the stored procedure has had time to run for the first. There are never more than two flowfiles destined for the same target table currently, so I'm able to mitigate the issue by routing one set to a retry processor and penalising the flowfiles for an arbitrary amount of time to give the first set time to complete their load. This isn't particularly elegant and it doesn't feel like it would scale very well. Also, my situation is likely to change, and I could have more than two common files in the future. To my mind, a Wait processor could check a cache when a flowfile passes through it, and if the cache is empty, add the target table name value to the cache as a signal. After the third step above is complete for this flowfile, it would pass through a Notify-type processor which removes the signal from the cache, releasing any flowfiles with a matching attribute from the queue on the earlier Wait.
I don't think I can get Wait and Notify to work in this manner, but I could be wrong? If so, is there another way to achieve this type of functionality? I'm aware of the Process Group FlowFile Concurrency and Outbound Policy settings, but these would be too restrictive, only processing a single flowfile at a time. I would like to process as many flowfiles concurrently as the DB will take, only holding the flowfiles which might cause the race condition. I hope that's clear. Thanks in advance.
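The gate semantics described above can be sketched in plain Java (this is an illustration of the desired behaviour, not NiFi code): one lock per target table, so loads for different tables run concurrently while loads sharing a table are serialized. In NiFi terms, the analogous approach would be a Wait/Notify pair whose Release Signal Identifier is driven by the table-name attribute against a DistributedMapCache, though whether that fully avoids the race here is exactly the open question. All names below are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of per-table gating: flowfiles targeting different tables
// proceed in parallel; flowfiles sharing a table queue behind one lock.
public class TableGate {
    private final ConcurrentHashMap<String, ReentrantLock> locks =
            new ConcurrentHashMap<>();

    public void runLoad(String table, Runnable loadSteps) {
        ReentrantLock lock = locks.computeIfAbsent(table, t -> new ReentrantLock());
        lock.lock();             // "Wait": block while another file holds this table
        try {
            loadSteps.run();     // truncate -> insert records -> stored procedure
        } finally {
            lock.unlock();       // "Notify": release the next queued load
        }
    }
}
```

The point of the sketch is that the lock key is the table name, so concurrency is limited only where the race condition can actually occur, rather than globally as with Process Group FlowFile Concurrency.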
Labels:
- Apache NiFi