Support Questions


NiFi unpack files issue

Explorer

Hi, 

I am trying to unpack files from a location stored in a SQL database table.
The zip file I am unpacking has the following structure:

Test.zip -> container folder -> 1.txt, 2.txt, 3.txt

I want to unpack this zip file, get the files, and put an entry in the SQL database table for each file.
The issue I am facing is that I keep getting duplicate entries in the database for these 3 files until the flow is stopped.
I am attaching the sample flow file.
I am new to NiFi, so any suggestion would be a great help.

Thanks in advance,
PC

 


Community Manager

@pratschavan Welcome to the Cloudera Community!

To help you get the best possible solution, I have tagged our NiFi experts @SAMSAL and @joseomjr, who may be able to assist you further.

Please keep us updated on your post, and we hope you find a satisfactory solution to your query.


Regards,

Diana Torres,
Community Moderator



Super Collaborator

If you want to avoid duplicates, you could hash the content of the files and leverage the DetectDuplicate processor so that only unique files are inserted into your DB.
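To make the idea concrete outside NiFi, here is a minimal Python sketch of the hash-and-dedup pattern; the in-memory set stands in for the distributed cache that DetectDuplicate relies on, and insert_row is a placeholder for your database insert step:

import hashlib

seen_hashes = set()  # stand-in for the distributed cache DetectDuplicate uses

def insert_if_new(file_path, insert_row):
    # Insert a DB row only for content that has not been seen before.
    with open(file_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest in seen_hashes:
        # NiFi would route this FlowFile to the "duplicate" relationship
        return False
    # NiFi would route this FlowFile to "non-duplicate"
    seen_hashes.add(digest)
    insert_row(file_path, digest)
    return True

In the actual flow this corresponds to a hash processor (for example CryptographicHashContent) writing a hash attribute, followed by DetectDuplicate keyed on that attribute, with only the non-duplicate relationship wired to the database insert.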

Explorer

I tried this solution, but I only get FlowFiles routed to duplicate and nothing to non-duplicate.

Master Mentor

@pratschavan 

This sounds like a configuration issue with your GetFile processor. You may have it configured so that you are consuming the same file over and over again.

The GetFile processor was deprecated in favor of the newer ListFile and FetchFile processors.

Are you running a single standalone instance of NiFi or running a multi-node NiFi cluster?
How is your GetFile processor configured?

Thanks,

Matt

Explorer

Hi Matt,
Thank you for your reply.
I am using the FetchFile processor to get the file from a folder location in the middle of the flow. Can you please suggest which settings I might need to look into?

Master Mentor

@pratschavan 

FetchFile is typically used in conjunction with ListFile so that it only fetches the content for the FlowFile it is passed. ListFile would only list each file once.

If you are using only the FetchFile processor, I am guessing you configured the "File to Fetch" property with the absolute path to your file. Used this way, the processor will fetch that same file every time it is scheduled to execute via its "Scheduling" tab configuration.
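To illustrate the difference outside NiFi (plain Python, just a sketch of the behavior, not NiFi internals): a hard-coded "File to Fetch" behaves like the first function below and returns the same file on every scheduled run, while ListFile keeps state between runs so each file is only emitted once:

import os

def fetch_fixed_path(path):
    # Roughly what FetchFile with a hard-coded "File to Fetch" does:
    # every scheduled run picks up the same file again.
    return [path]

last_listed = 0.0  # ListFile persists comparable state between runs

def list_new_files(directory):
    # Roughly what ListFile does: only emit files newer than the state
    # saved from the previous run, so each file is listed once.
    global last_listed
    new_files = [
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if os.path.getmtime(os.path.join(directory, name)) > last_listed
    ]
    if new_files:
        last_listed = max(os.path.getmtime(f) for f in new_files)
    return new_files

That is why pairing ListFile with FetchFile avoids re-processing the same file, whereas FetchFile alone with a fixed path will re-fetch it on every run.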

Can you share screenshots of how you have these two processors configured?

If you found that any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them.

Thank you,
Matt

Community Manager

@pratschavan Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.  Thanks.


Regards,

Diana Torres,
Community Moderator

