How to know in which directory data is stored after a process completes in NiFi

Rising Star

Hi All,

I created a flow in which the GetFile processor executed successfully. How do I find the directory in which the file is stored?

Steps I followed:

1. Opened the web UI at localhost:8080/nifi/

2. Added a GetFile processor

3. Linked the processor to LogAttribute via the success relationship

4. Started the flow

5. The file was moved from the local file system into a NiFi directory

How can I find which folder it was stored in?

1 ACCEPTED SOLUTION

Expert Contributor

Typically the GetFile processor pulls the file into NiFi so you can do some type of processing or routing. It doesn't really put the file anywhere in particular.

You should use something like the PutFile processor to move the file to a location of your choosing. Just make sure to route the success relationship to the PutFile processor and configure the PutFile processor to your liking.

GetFile

PutFile
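
To make the GetFile-to-PutFile idea concrete, here is a minimal Python sketch of what such a flow does to files on disk. It is only a model of the behavior, not NiFi code: the function name get_then_put and the keep_source parameter are made up for illustration (keep_source loosely mirrors GetFile's "Keep Source File" property, which is false by default).

```python
from pathlib import Path
from typing import List


def get_then_put(input_dir: str, output_dir: str, keep_source: bool = False) -> List[str]:
    """Model of a GetFile -> PutFile flow (illustrative only, not NiFi code).

    Each regular file in input_dir is read, optionally removed from the
    source directory, and written under the same name into output_dir.
    """
    src = Path(input_dir)
    dst = Path(output_dir)
    dst.mkdir(parents=True, exist_ok=True)

    written = []
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        content = path.read_bytes()             # "GetFile": content enters the flow
        if not keep_source:
            path.unlink()                       # source file is removed on pickup
        (dst / path.name).write_bytes(content)  # "PutFile": content lands here
        written.append(path.name)
    return written
```

Without the second step the content only lives inside the flow, which is why the file seems to disappear when GetFile is used on its own.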

Rising Star

Thank you @zblanco

Super Mentor
@AnjiReddy Anumolu

Just to add a little more detail to the above response from @zblanco.

When NiFi ingests data, that data is turned into NiFi FlowFiles. A NiFi FlowFile consists of attributes (metadata about the actual data) and the physical data itself. The FlowFile metadata is stored in the FlowFile repository as well as in JVM heap memory for faster performance. FlowFile attributes include things like filename, ingest time, lineage age, file size, the connection in the dataflow where the FlowFile currently resides, any user-defined metadata, processor-added metadata, etc. The physical bytes that make up the actual data content are written to claims within the NiFi content repository. A claim can contain the bytes for one to many ingested data files. For more info on the content repository and how claims work, see the following link:

https://community.hortonworks.com/articles/82308/understanding-how-nifis-content-repository-archivi....
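
As a rough picture of the FlowFile structure described above, here is a small Python sketch. The class and field names (FlowFile, ContentClaim, identifier, offset, size) are illustrative assumptions, not NiFi's actual classes; the point is only that attributes and content are tracked separately, and that several FlowFiles can point into the same claim at different offsets.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ContentClaim:
    """Pointer to a FlowFile's bytes inside the content repository (illustrative)."""
    identifier: str  # name of the claim file on disk
    offset: int      # byte position where this FlowFile's content starts
    size: int        # number of content bytes


@dataclass
class FlowFile:
    """Attributes (metadata) plus a pointer to the content (illustrative)."""
    attributes: Dict[str, str] = field(default_factory=dict)
    claim: Optional[ContentClaim] = None


# Two ingested files whose bytes sit back to back in one claim file.
first = FlowFile({"filename": "a.txt"}, ContentClaim("1467063966583-11", 0, 10))
second = FlowFile({"filename": "b.txt"}, ContentClaim("1467063966583-11", 10, 5))
```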

Thanks,

Matt

avatar

Normally the content repository holds the content for all the FlowFiles in the system. By default, it is installed in the same root installation directory as all the other repositories; as an admin you can configure it on a separate drive if available. For example, check {nifi_install_dir}/content_repository for its contents.
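
If the repository has been moved, its location can be read from conf/nifi.properties. Below is a small Python sketch that resolves the directory from the text of that file. It assumes the standard nifi.content.repository.directory.default property and assumes that a relative value is resolved against the install directory; the function name content_repo_dir is made up for this example.

```python
from pathlib import Path

# Property that names the default content repository directory in nifi.properties.
CONTENT_REPO_KEY = "nifi.content.repository.directory.default"


def content_repo_dir(install_dir: str, properties_text: str) -> Path:
    """Return the content repository path named in nifi.properties text.

    Falls back to ./content_repository when the property is missing or empty.
    Assumption: relative values are resolved against install_dir.
    """
    value = "./content_repository"
    for line in properties_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not key=value pairs
        key, _, val = line.partition("=")
        if key.strip() == CONTENT_REPO_KEY and val.strip():
            value = val.strip()
    path = Path(value)
    return path if path.is_absolute() else Path(install_dir) / path
```

For example, content_repo_dir("/opt/nifi", Path("/opt/nifi/conf/nifi.properties").read_text()) would give the directory to look in, where /opt/nifi is a placeholder for your own install directory.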

Rising Star

Thank you @milind pandit

avatar

Hi @AnjiReddy Anumolu,

An easy way to get hold of your file is through provenance:

- In the NiFi UI, click the provenance button in the top right corner.

5311-screen-shot-2016-06-28-at-72103-am.png

- Find the event for your file and click the "view details" button.

5312-screen-shot-2016-06-28-at-72333-am.png

- You can view or download the file on the "contents" tab:

5313-screen-shot-2016-06-28-at-72527-am.png

If you need to see the file contents on your server, search the content_repository for the file named after the "identifier" of the output claim (e.g. 1467063966583-11 in the screenshot above), and look at the given "offset" (e.g. 463775 in the screenshot above).

5314-screen-shot-2016-06-28-at-74030-am.png
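
The lookup by identifier and offset can be sketched in Python as follows. This is only an inspection aid under some assumptions: the function names are made up, the size argument is the content size reported alongside the claim in the provenance event details, and the repository is assumed to store content as plain, unencrypted bytes. The content repository is internal to NiFi, so only read from it.

```python
import os
from typing import Optional


def find_claim_file(repo_dir: str, identifier: str) -> Optional[str]:
    """Walk the content repository and return the path of the claim file, if any."""
    for root, _dirs, files in os.walk(repo_dir):
        if identifier in files:
            return os.path.join(root, identifier)
    return None


def read_claim(repo_dir: str, identifier: str, offset: int, size: int) -> bytes:
    """Read size bytes starting at offset from the claim file named identifier."""
    path = find_claim_file(repo_dir, identifier)
    if path is None:
        raise FileNotFoundError(identifier)
    with open(path, "rb") as fh:
        fh.seek(offset)       # jump to where this FlowFile's content starts
        return fh.read(size)  # read only this FlowFile's bytes
```

With the values from the screenshot, the call would look like read_claim(repo_dir, "1467063966583-11", 463775, size), with repo_dir and size filled in from your own installation and provenance event.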

Hope this helps!

Thanks!

Rising Star

Thank you @Jobin George

Master Guru

Both of the above answers are correct. Just to provide a full picture of what is happening...

GetFile picks up the file from the directory and brings it into NiFi's content_repository, which as milind pointed out is located by default under {nifi_install_dir}/content_repository. This directory is not meant to be used by the user; it is for NiFi's internal purposes.

The FlowFile is then transferred to LogAttribute, which logs information. I assume that if that is the end of your flow, you must have marked the success relationship on LogAttribute as auto-terminated. At this point the FlowFile is removed from NiFi, and the content in the content repository will eventually be removed.

NiFi is not meant to be a storage system where you bring data in and then leave it there. Your flow would have to send the data somewhere after GetFile.

Rising Star

Thank you @Bryan Bende