How to find out which directory data is stored in after a process completes in NiFi


Hi All,

I created a flow and the GetFile operation executed successfully. How can I find out which directory the file is stored in?

Steps I followed:

1. Opened the web UI at localhost:8080/nifi/

2. Added a GetFile processor

3. Connected it to a LogAttribute processor via the success relationship

4. Started the flow

5. The file was moved from the local file system into a NiFi directory

How can I find which folder it was stored in?

1 ACCEPTED SOLUTION


Typically, the GetFile processor pulls the file into NiFi so you can do some type of processing or routing. It doesn't really put the file anywhere in particular.

You should use something like the PutFile processor to move the file to a location of your choosing. Just make sure to route the success relationship to the PutFile processor and configure the PutFile processor to your liking.

GetFile

PutFile


11 REPLIES


Thank you @Bryan Bende


@AnjiReddy Anumolu

Let me start by making sure I fully understand the dataflow you have created, so I can better answer your question. You have added a GetFile processor to your flow, which picks up file(s) from a local file system directory and sends them via the success relationship to a LogAttribute processor.

What did you do with the LogAttribute processor's success relationship?

If it is auto-terminated, you are essentially telling NiFi you are done with the file(s) once their FlowFile attributes/metadata have been logged successfully. If the success relationship has neither been connected nor auto-terminated, the processor will remain invalid and cannot be run. In that case, the file(s) picked up by the GetFile processor will remain queued on the connection between the GetFile processor and the LogAttribute processor.

In either case, when NiFi ingests file(s), their content is placed in the NiFi content repository. The location of the content repository is configured in the nifi.properties file. By default it is a directory created within the NiFi installation directory:

nifi.content.repository.directory.default=./content_repository 
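
As a rough illustration of how to look this location up outside the UI, here is a minimal Python sketch that reads nifi.properties and resolves a repository directory. The helper names (load_properties, resolve_repository) and the /opt/nifi path are hypothetical, and the sketch assumes that relative values are resolved against the NiFi installation directory, as described above.

```python
from pathlib import Path


def load_properties(path):
    """Parse simple key=value lines from a Java-style .properties file.

    Only handles the plain form used in nifi.properties: no line
    continuations and no escape sequences.
    """
    props = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def resolve_repository(nifi_home, props, key):
    """Return the directory configured under `key`, or None if it is unset.

    Assumption: relative paths are relative to the NiFi installation
    directory (nifi_home).
    """
    value = props.get(key)
    if value is None:
        return None
    path = Path(value)
    return path if path.is_absolute() else Path(nifi_home) / path
```

For example, with a hypothetical installation in /opt/nifi, resolve_repository("/opt/nifi", load_properties("/opt/nifi/conf/nifi.properties"), "nifi.content.repository.directory.default") would point at the content repository directory on disk.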

NiFi stores file(s) in what are known as claims to make the most efficient use of the system's hard disks. A claim can contain one to many files. The default claim configuration is also defined in the nifi.properties file, as follows:

nifi.content.claim.max.appendable.size=10 MB 
nifi.content.claim.max.flow.files=100

Files smaller than 10 MB may be stored together with other files, with up to 100 files in a single claim. A file larger than 10 MB ends up in a claim of its own. At the same time as files are written to a claim, FlowFile attributes/metadata about the ingested files are written to the FlowFile repository. The location of the FlowFile repository is also configured in the nifi.properties file:

nifi.flowfile.repository.directory=./flowfile_repository 
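
To make the claim limits described above more concrete, here is a simplified Python model of how files might be grouped into claims under the default settings. It follows only the description in this reply; it is not NiFi's actual implementation. The function name assign_claims, the rule for when a claim stops accepting files, and the reading of "10 MB" as 10 * 1024 * 1024 bytes are all assumptions.

```python
# Defaults quoted from nifi.properties above.
# Assumption: "10 MB" is read here as 10 * 1024 * 1024 bytes.
MAX_APPENDABLE_SIZE = 10 * 1024 * 1024
MAX_FLOW_FILES = 100


def assign_claims(file_sizes):
    """Group file sizes into claims.

    Returns one (claim_index, offset, size) tuple per input file, in order.
    """
    placements = []
    claim_index = -1     # index of the most recently created claim
    claim_bytes = 0      # bytes already written to the open shared claim
    claim_files = 0      # number of files in the open shared claim
    claim_open = False   # whether a shared claim is accepting small files

    for size in file_sizes:
        if size > MAX_APPENDABLE_SIZE:
            # A large file ends up in a claim of its own.
            claim_index += 1
            placements.append((claim_index, 0, size))
            claim_open = False
            continue

        claim_full = (claim_files >= MAX_FLOW_FILES
                      or claim_bytes >= MAX_APPENDABLE_SIZE)
        if not claim_open or claim_full:
            # Start a new shared claim for small files.
            claim_index += 1
            claim_bytes = 0
            claim_files = 0
            claim_open = True

        # The file's offset is the number of bytes already in the claim.
        placements.append((claim_index, claim_bytes, size))
        claim_bytes += size
        claim_files += 1

    return placements
```

In this model, three 100-byte files share one claim at offsets 0, 100, and 200, while the 101st small file in a row starts a new claim at offset 0.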

These FlowFile attributes/metadata contain information such as filename, file size, location of the claim in the content repository, claim offset, etc. The claim offset is the starting byte location of a particular file's content within a claim. The file size defines the number of bytes from that offset that make up the complete data.

The nifi-app.log contains fairly robust logging by default (configured in the logback.xml file). When NiFi ingests files, it logs that, and the log line contains information about the claim (location and offset). When NiFi auto-terminates FlowFiles, they are removed from the content repository. Depending on the content repository archive setup, the file(s) may be archived for a period of time. Archived file(s) can be replayed using the provenance UI in NiFi.
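
The offset and size arithmetic can be sketched in a few lines of Python. This is only an illustration: read_flowfile_content and claim_path are hypothetical names, and the sketch assumes the claim is a plain file on disk whose bytes are stored as-is (no compression or encryption).

```python
def read_flowfile_content(claim_path, offset, size):
    """Return the `size` bytes that start at `offset` inside a claim file."""
    with open(claim_path, "rb") as claim:
        claim.seek(offset)          # jump to the claim offset
        data = claim.read(size)     # read exactly the file's size in bytes
    if len(data) != size:
        raise ValueError(
            f"expected {size} bytes at offset {offset}, got {len(data)}"
        )
    return data
```

Given a claim that holds two files back to back, the second file's content would be read with its own offset (the size of the first file) and its own size.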

Thanks,

Matt