Executing SQL Script to Process Files in NiFi

Explorer

I'm working on a data processing flow using Apache NiFi, and I have a scenario that I'd like some guidance on.

  1. I have an Execute SQL script that retrieves data from a SQL Server table. The returned data includes source and target folder paths.

  2. My goal is to loop through each of these source folder paths. For each folder, if there are any files present, I need to:

    • Insert a record into the database to log the file's details.
    • Move the file from the source folder to the corresponding target folder.

I'm currently using NiFi, and I'm wondering what processors and strategies I should consider to achieve this workflow efficiently. Any insights, recommendations, or examples of similar workflows would be greatly appreciated.
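For reference, the per-folder logic described above can be sketched in plain Python, outside NiFi. This is only an illustration of the intended behavior, not a NiFi configuration. The `file_log` table, its columns, and the `process_folders` name are hypothetical, and `sqlite3` stands in for SQL Server so the sketch stays self-contained.

```python
import shutil
import sqlite3
from pathlib import Path


def process_folders(conn, folder_pairs):
    """Log and move every file found in each source folder.

    folder_pairs is an iterable of (source_dir, target_dir) tuples,
    standing in for the rows returned by the SQL query.
    Returns the number of files moved.
    """
    moved = 0
    for source_dir, target_dir in folder_pairs:
        source, target = Path(source_dir), Path(target_dir)
        if not source.is_dir():
            continue  # skip source paths that do not exist
        target.mkdir(parents=True, exist_ok=True)
        for path in sorted(source.iterdir()):
            if not path.is_file():
                continue  # ignore subdirectories
            # Log the file's details (hypothetical file_log table).
            conn.execute(
                "INSERT INTO file_log "
                "(file_name, source_dir, target_dir, size_bytes) "
                "VALUES (?, ?, ?, ?)",
                (path.name, str(source), str(target), path.stat().st_size),
            )
            # Move the file to the corresponding target folder.
            shutil.move(str(path), str(target / path.name))
            moved += 1
    conn.commit()
    return moved
```

An empty source folder produces no log records and no moves, which is the behavior I expect from the flow as well.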

Thank you in advance for your help!

Community Manager

@RRG, Welcome to our community! To help you get the best possible answer, I have tagged in our NiFi experts @SAMSAL @ckumar @MattWho @steven-matison  who may be able to assist you further.

Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.



Regards,

Vidya Sargur,
Community Manager


Explorer

@SAMSAL  @VidyaSargur  
When using the ExtractText processor, I'm encountering an issue where it returns matched data with file names like 'README,' 'NOTICE,' and 'LICENSE.'

I've noticed that even if there are no files in the source folder, running the package results in records being inserted into the database with these 'README,' 'NOTICE,' and 'LICENSE' file names.

Is there a way to prevent this behavior, so that only actual files in the source folder are processed and inserted into the database?

@RRG,

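In the ExtractText processor, you can use the following pattern to match only valid file names:

(^((?!README|LICENSE|NOTICE).)*$)

This pattern matches only text that does not contain README, LICENSE, or NOTICE anywhere in it, so those names are excluded. As written, the match is case-sensitive.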
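To see how the pattern behaves, here is a small sketch using Python's `re` module, which handles this negative-lookahead syntax the same way as the Java regex engine. It only tests the expression against sample names and does not reproduce ExtractText itself. The `is_valid` helper, the sample file names, and the `EXACT_ONLY` variant are assumptions added for illustration.

```python
import re

# Pattern from above: no position in the text may start README, LICENSE, or NOTICE.
PATTERN = re.compile(r"(^((?!README|LICENSE|NOTICE).)*$)")

# Hypothetical stricter variant: rejects a name only when it is exactly
# one of the three words, so names that merely contain them still match.
EXACT_ONLY = re.compile(r"^(?!(?:README|LICENSE|NOTICE)$).+$")


def is_valid(name, pattern=PATTERN):
    """Return True when the pattern matches the given file name."""
    return pattern.match(name) is not None


for name in ["data.csv", "README", "my_README.txt", "readme.txt"]:
    print(name, is_valid(name), is_valid(name, EXACT_ONLY))
```

Note that the original pattern also rejects a genuine file name such as `my_README.txt`, because the word appears inside it. The variant is one possible way to exclude only the three exact names.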

Hope that helps.

Explorer

Thank you!

What if I receive a real file name that includes this text?

@RRG,

Not sure I understand. Can you please elaborate?