Listing folders with NiFi

I am trying to use NiFi to automatically create external Impala tables whenever new subfolders appear under a certain directory. The directory structure is:
data/<timestamp>/<tables>/*.parquet
My plan is to first list the folders under data, then list the subfolders of each of those, so that I can use them as table names in a SQL query that I send to Impala.
However, the ListFile processor only returns files. I am not interested in the file names; I only need the folders.
Is there a way to do this with NiFi or is my plan complete nonsense?

1 ACCEPTED SOLUTION

@DrManu,

I do not think you will find a NiFi processor that returns only the folder names under a location 😞 You will either have to write your own processor or combine several that already ship with NiFi.

How I would approach this scenario:
- An ExecuteStreamCommand processor running a custom script that reads your folder structure and generates a JSON document in which each entry is the complete path to one folder.
- Afterwards, a SplitJson processor to generate one FlowFile per folder and send it downstream for further processing.
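
The custom script in the first step could look roughly like the sketch below. It is only an illustration, not a tested part of any flow: the function name list_table_folders, the default root "data", and the output fields path, timestamp, and table are all assumptions chosen to match the data/<timestamp>/<tables> layout from the question. It prints a JSON array with one object per table folder, which is the kind of document SplitJson can split.

```python
#!/usr/bin/env python3
"""List the table folders under data/<timestamp>/<tables> as a JSON array."""
import json
import sys
from pathlib import Path


def list_table_folders(root):
    """Return one dict per directory found exactly two levels below root.

    Files at either level are ignored, so only folders are reported.
    A missing root yields an empty list instead of an error.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return []

    folders = []
    # First level: the <timestamp> directories.
    for timestamp_dir in sorted(p for p in root_path.iterdir() if p.is_dir()):
        # Second level: the <tables> directories inside each timestamp.
        for table_dir in sorted(p for p in timestamp_dir.iterdir() if p.is_dir()):
            folders.append({
                "path": str(table_dir),
                "timestamp": timestamp_dir.name,
                "table": table_dir.name,
            })
    return folders


if __name__ == "__main__":
    # The root directory is an optional first argument (hypothetical default: "data").
    root = sys.argv[1] if len(sys.argv) > 1 else "data"
    json.dump(list_table_folders(root), sys.stdout)
```

In ExecuteStreamCommand you would point Command Path at the Python interpreter and pass the script path and the root directory as command arguments. Keep in mind that ExecuteStreamCommand needs an incoming FlowFile to trigger it. For SplitJson, a JsonPath expression along the lines of $.* should produce one FlowFile per array element, but please verify that against your NiFi version.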

2 REPLIES 2

Thank you very much!

This gives me confidence that my attempts are not going in the wrong direction. I already use ExecuteStreamCommand processors extensively, and it is a pleasure to use them here as well.