Apache NiFi: ListAzureBlobStorage doesn't take all the files in Blob (only a few)
Labels: Apache NiFi
Contributor
Created ‎04-09-2021 01:29 AM
Hi @ApacheNifi,
I have implemented this flow in NiFi.
There are two types of files in the blob storage (a and b).
The ListAzureBlobStorage processor finds only the b files (which are few) on restart.
To solve the problem I always have to delete the processors (List and RouteOnAttribute) and create new ones with the same settings. Why does this happen? Does anyone have the same problem?
Thank you so much
1 REPLY
Expert Contributor
Created ‎04-09-2021 09:54 PM
Hello,
According to the documentation on state management, the processor only pulls files that are new or modified since the last run:
State management:
Scope | Description
CLUSTER | After performing a listing of blobs, the timestamp of the newest blob is stored. This allows the Processor to list only blobs that have been added or modified after this date the next time that the Processor is run. State is stored across the cluster so that this Processor can be run on Primary Node only and if a new Primary Node is selected, the new node can pick up where the previous node left off, without duplicating the data.
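
To make the behaviour described above concrete, here is a minimal Python sketch of timestamp-based listing. It is a simplified model, not NiFi's actual implementation: the class `TimestampListing`, its methods, and the blob names and integer timestamps are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TimestampListing:
    """Simplified model of timestamp-based listing (hypothetical, not NiFi code)."""

    # Stored state: timestamp of the newest blob seen so far (None = no state yet).
    last_seen: Optional[int] = None

    def list_new(self, blobs: Dict[str, int]) -> List[str]:
        """Return names of blobs modified after the stored timestamp, then update state."""
        if self.last_seen is None:
            new = sorted(blobs)
        else:
            new = sorted(name for name, ts in blobs.items() if ts > self.last_seen)
        if blobs:
            newest = max(blobs.values())
            self.last_seen = newest if self.last_seen is None else max(self.last_seen, newest)
        return new

    def clear_state(self) -> None:
        """Forget the stored timestamp so the next listing starts from scratch."""
        self.last_seen = None


# Hypothetical blobs mapped to last-modified timestamps.
blobs = {"a1": 100, "a2": 110, "b1": 200}
listing = TimestampListing()

first = listing.list_new(blobs)    # no state yet: ['a1', 'a2', 'b1']
blobs["b2"] = 300
second = listing.list_new(blobs)   # only the newer blob: ['b2']
blobs["a3"] = 150                  # timestamp older than the stored state
third = listing.list_new(blobs)    # skipped: []
listing.clear_state()
fourth = listing.list_new(blobs)   # everything again: ['a1', 'a2', 'a3', 'b1', 'b2']
```

In this model, a freshly created processor has no stored timestamp, which would be consistent with the observation that deleting and recreating the processor lists every file again. In NiFi itself, it should be possible to get the same effect without recreating anything by stopping the processor and clearing its stored state from the "View state" dialog; treat that as a suggestion to verify against your NiFi version.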
