Fetch Vs Get vs List processors in NiFi

Hi,

I'm pretty new to NiFi and trying to understand the difference between the Fetch, Get, and List processors.

List: as I understand it, this creates flow files that carry only metadata, not the data itself. That metadata can then be passed downstream to a processor that reads the file contents.

I am pretty confused about Get vs. Fetch and which one should be used in which situation.

1 ACCEPTED SOLUTION

Cloudera Employee

Hello @Teej 

 

The short answer is that FetchX (FetchFTP, for example) is NiFi cluster friendly, while GetX processors are not.

 

There is a common pattern ("List-Fetch"): run ListX on a single node, then distribute that listing across all nodes in the cluster so they can run FetchX in parallel. Each listed file is handed to exactly one node, so each file is fetched only once.
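
To make the pattern concrete, here is a rough sketch in plain Python (not NiFi code). It only simulates the idea: one node produces a metadata-only listing, the listing is split across the nodes, and each node fetches just the entries it was given. The node names, the in-memory `remote_dir`, and the round-robin hand-off are all assumptions made up for the illustration.

```python
from collections import defaultdict
from itertools import cycle


def list_files(remote_dir):
    """ListX step: emit one metadata-only record per file (no content)."""
    return [
        {"filename": name, "size": len(data)}
        for name, data in sorted(remote_dir.items())
    ]


def distribute(listing, nodes):
    """Hand each listing record to exactly one node (round-robin, assumed)."""
    assignments = defaultdict(list)
    for record, node in zip(listing, cycle(nodes)):
        assignments[node].append(record)
    return assignments


def fetch(remote_dir, record):
    """FetchX step: read the content of the one file named in the record."""
    return remote_dir[record["filename"]]


# Hypothetical remote directory and cluster, for illustration only.
remote_dir = {"a.csv": b"1,2", "b.csv": b"3,4", "c.csv": b"5,6", "d.csv": b"7,8"}
nodes = ["node-1", "node-2", "node-3"]

listing = list_files(remote_dir)          # runs on a single node
assignments = distribute(listing, nodes)  # listing is spread over the cluster
fetched = {
    node: {r["filename"]: fetch(remote_dir, r) for r in records}
    for node, records in assignments.items()
}
```

Since every listing record lands on exactly one node, no file is pulled twice even though all nodes are fetching at the same time.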

 

If you have a NiFi cluster and you are using the GetSFTP processor, you would have to configure that processor to run on the primary node only so the other nodes in the cluster wouldn't try to pull the same files. 
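
As a simplified illustration of why that setting matters, the sketch below (plain Python, not NiFi code) counts how many times each file would be pulled when a Get-style processor both lists and pulls on every node where it runs. The function name `run_get` is hypothetical, and treating the first node in the list as the primary is an assumption for the example. In a real cluster the nodes would more likely race or hit errors than cleanly pull every file several times.

```python
from collections import Counter


def run_get(remote_files, nodes, primary_only):
    """Count pull attempts per file for a Get-style processor.

    Each active node lists the directory itself and pulls everything it sees.
    Assumption: the first entry in `nodes` stands in for the primary node.
    """
    active_nodes = nodes[:1] if primary_only else nodes
    pulls = Counter()
    for _node in active_nodes:
        for name in remote_files:
            pulls[name] += 1
    return pulls


files = ["a.csv", "b.csv"]
cluster = ["node-1", "node-2", "node-3"]

on_all_nodes = run_get(files, cluster, primary_only=False)
on_primary = run_get(files, cluster, primary_only=True)
```

Here `on_all_nodes` records three attempts per file, while `on_primary` records one, which is the behavior you want from a Get processor in a cluster.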

 

You can read more about it here.
