Created 12-28-2023 10:58 PM
Hi,
I am trying to unpack files from a location stored in a SQL database table.
When I unpack a zip file that has the structure below:
Test.zip -> container folder -> 1.txt, 2.txt, 3.txt
I want to unpack this zip file, get the files, and insert one entry into a SQL database table for each file.
The issue I am facing is that I keep getting duplicate entries in the database for these 3 files until the flow is stopped.
I am attaching the sample flow file.
I am new to NiFi, so any suggestions would be a great help.
Thanks in advance,
PC
Created 12-29-2023 08:47 AM
@pratschavan Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our NiFi experts @SAMSAL @joseomjr who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres,
Created 12-29-2023 08:50 AM
If you want to avoid duplicates you could hash the content of the files and leverage the DetectDuplicate processor to only insert the unique files into your DB.
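To illustrate the idea outside of NiFi, here is a minimal Python sketch of content-hash de-duplication. It is not NiFi configuration and not the DetectDuplicate implementation; the `files` table, the `container/` folder name, the file contents, and the helper names `build_sample_zip` and `insert_unique_entries` are all hypothetical, chosen only to mirror the Test.zip layout from the question.

```python
import hashlib
import io
import sqlite3
import zipfile


def build_sample_zip() -> bytes:
    """Build an in-memory zip shaped like Test.zip -> container -> 1.txt, 2.txt, 3.txt."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in ("1.txt", "2.txt", "3.txt"):
            # Hypothetical contents; each file gets different bytes.
            zf.writestr(f"container/{name}", f"contents of {name}")
    return buf.getvalue()


def insert_unique_entries(zip_bytes: bytes, conn: sqlite3.Connection) -> int:
    """Insert one row per zip entry, skipping any content hash already stored."""
    inserted = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Hash the unpacked content; identical content yields the same digest.
            digest = hashlib.sha256(zf.read(info)).hexdigest()
            seen = conn.execute(
                "SELECT 1 FROM files WHERE sha256 = ?", (digest,)
            ).fetchone()
            if seen:
                continue  # duplicate content: do not insert again
            conn.execute(
                "INSERT INTO files (sha256, name) VALUES (?, ?)",
                (digest, info.filename),
            )
            inserted += 1
    conn.commit()
    return inserted


# Hypothetical table standing in for the real SQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (sha256 TEXT PRIMARY KEY, name TEXT)")

data = build_sample_zip()
first = insert_unique_entries(data, conn)   # first pass over the zip
second = insert_unique_entries(data, conn)  # same zip processed again
print(first, second)
```

The first pass inserts the three files, and processing the same zip again inserts nothing, because every hash is already stored. Note that this only hides the symptom: if the same zip is being picked up repeatedly, the source of the repetition still needs to be fixed.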
Created 01-09-2024 11:47 PM
I tried this solution, but FlowFiles are only routed to the duplicate relationship; nothing comes out of non-duplicate.
Created 01-02-2024 06:25 AM
@pratschavan
This sounds like a configuration issue with your GetFile processor. You may have it configured so that you are consuming the same file over and over again.
The GetFile processor was deprecated in favor of the newer ListFile and FetchFile processors.
Are you running a single standalone instance of NiFi or running a multi-node NiFi cluster?
How is your GetFile processor configured?
Thanks,
Matt
Created 01-09-2024 11:46 PM
Hi Matt,
Thank you for your reply.
I am using a FetchFile processor in the middle of the flow to get the file from a folder location. Can you please suggest which settings I might need to look into?
Created 01-10-2024 12:57 PM
@pratschavan
FetchFile is typically used in conjunction with ListFile so that it only fetches the content for the FlowFile it is passed. ListFile would only list the file once.
If you are using only the FetchFile processor, I am guessing you configured the "File to Fetch" property with the absolute path to your file. Used this way, the processor will fetch that same file every time it is scheduled to execute, as set on the processor's "Scheduling" tab.
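The difference between the two patterns can be sketched in plain Python. This is a simplified simulation, not NiFi code: the `ListOnceSource` class, the `fetch` helper, and the `/data/in/Test.zip` path are hypothetical, and remembering seen paths in a set is an assumption made for illustration rather than a description of how ListFile actually stores its state.

```python
from typing import Dict, List, Set


class ListOnceSource:
    """Hypothetical stand-in for a listing step that emits each path only once."""

    def __init__(self) -> None:
        self._already_listed: Set[str] = set()

    def run(self, directory: Dict[str, bytes]) -> List[str]:
        # Emit only paths that were not emitted by an earlier run.
        new_paths = sorted(p for p in directory if p not in self._already_listed)
        self._already_listed.update(new_paths)
        return new_paths


def fetch(directory: Dict[str, bytes], path: str) -> bytes:
    """Hypothetical fetch step: return the content stored at the given path."""
    return directory[path]


# A fake directory holding a single file (path is an assumption).
directory = {"/data/in/Test.zip": b"zip bytes"}

# Pattern A: fetch configured with a fixed path and triggered on every
# scheduled run. The same file is fetched once per run.
fixed_path_fetches = [fetch(directory, "/data/in/Test.zip") for _ in range(3)]

# Pattern B: a list-once step feeds the fetch step. Only the first run
# produces anything, so the file is fetched a single time.
lister = ListOnceSource()
listed_fetches = []
for _ in range(3):
    for path in lister.run(directory):
        listed_fetches.append(fetch(directory, path))

print(len(fixed_path_fetches), len(listed_fetches))
```

Over three scheduled runs, the fixed-path pattern fetches the file three times, while the list-then-fetch pattern fetches it once. Downstream, the first pattern would unpack and insert the same three files on every run, which matches the duplicate rows described in the question.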
Can you share screenshots of how you have these two processors configured?
If any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them.
Thank you,
Matt
Created 01-08-2024 02:55 PM
@pratschavan Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
Regards,
Diana Torres,