Member since: 01-08-2021 · Posts: 12 · Kudos Received: 2 · Solutions: 0
04-12-2021 12:15 AM
Hi, thank you. However, that is the mechanism of every List processor: it keeps track of the files already processed. What I was wondering is whether there is a way to trigger the flow, or send it a notification, when new files arrive in the blob storage. Thanks
04-09-2021 09:54 PM
Hello, according to the documentation on state management, it will only pull files that are new compared to the last run: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-azure-nar/1.5.0/org.apache.nifi.processors.azure.storage.ListAzureBlobStorage/

State management:
Scope: CLUSTER
Description: After performing a listing of blobs, the timestamp of the newest blob is stored. This allows the Processor to list only blobs that have been added or modified after this date the next time that the Processor is run. State is stored across the cluster so that this Processor can be run on Primary Node only and if a new Primary Node is selected, the new node can pick up where the previous node left off, without duplicating the data.
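For illustration, here is a minimal Python sketch of that timestamp-based listing pattern, using the azure-storage-blob v12 SDK. This is not NiFi's actual implementation; the connection string, container name, and local state file are placeholders.

```python
# Sketch of the CLUSTER-state listing pattern described above:
# remember the newest blob timestamp from the previous run and
# return only blobs modified after it.
import json
from pathlib import Path
from azure.storage.blob import BlobServiceClient

STATE_FILE = Path("list_state.json")  # stand-in for NiFi's cluster state

def list_new_blobs(conn_str: str, container: str) -> list[str]:
    service = BlobServiceClient.from_connection_string(conn_str)
    container_client = service.get_container_client(container)

    # Load the timestamp of the newest blob seen on the last run.
    last_seen = None
    if STATE_FILE.exists():
        last_seen = json.loads(STATE_FILE.read_text())["newest"]

    new_blobs, newest = [], last_seen
    for blob in container_client.list_blobs():
        ts = blob.last_modified.isoformat()  # tz-aware UTC, sorts lexically
        if last_seen is None or ts > last_seen:
            new_blobs.append(blob.name)
        if newest is None or ts > newest:
            newest = ts

    # Persist the newest timestamp so the next run skips older blobs.
    if newest is not None:
        STATE_FILE.write_text(json.dumps({"newest": newest}))
    return new_blobs
```

On the first run (no stored state) every blob is returned, which matches the processor's behavior; after that, only blobs added or modified since the stored timestamp are listed.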
03-08-2021 10:10 AM
Yes, that's right! I added a volume in the Docker Compose file that connects a directory on the VM to the container. Thank you so much!
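For reference, a volume mapping of that kind might look like the following docker-compose excerpt. The image tag and both paths are placeholders, not taken from the original post.

```yaml
# Hypothetical excerpt: bind-mounts a directory on the VM into the
# container so files written on the host are visible inside NiFi.
services:
  nifi:
    image: apache/nifi:latest
    volumes:
      - /home/user/data:/opt/nifi/input   # host path : container path
```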
01-21-2021 07:06 AM
@Lallagreta Make sure you do not have any line returns in the values of the dynamic properties you added in the UpdateAttribute processor. When you click on the value field for each property, you should not see a second line (line "2") in the value editor; a value spanning two lines would result in the FlowFile attribute containing a line return. (The original post illustrated this with a screenshot of the value editor.) If this is the case, edit the property value(s) to remove the line returns so that only one line (line "1") remains. Hope this helps, Matt
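As a defensive alternative (not from Matt's post), NiFi's Expression Language trim() function strips leading and trailing whitespace, including stray line returns, when the value is built from another attribute; the attribute name here is a placeholder:

```
${my_attribute:trim()}
```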
01-11-2021 05:54 AM
You likely have the record reader configured incorrectly for your CSV schema.
01-11-2021 05:52 AM
@Lallagreta You should be able to define the filename, or change it to whatever you want. That said, the filename doesn't dictate the type, so you can have Parquet saved as .txt.

One recommendation I have is to use the parquet command-line tools while testing your use case. This is the best way to validate that the files look right and have the right schema and results: https://pypi.org/project/parquet-tools/

I apologize that I do not have exact samples, but from my recollection of a year ago, you should be able to find a simple command to check the schema of a file and another to show the data. You may have to copy your HDFS files to the local file system to inspect them from the command line.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post. Thanks, Steven
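The kind of commands Steven describes might look like this; the file paths are placeholders, and you should check the parquet-tools documentation for the exact subcommands in your installed version:

```shell
# Copy the file out of HDFS to the local file system first.
hdfs dfs -get /user/nifi/output/part-00000.parquet .

# Install the Python parquet-tools package linked above.
pip install parquet-tools

# Inspect the schema and metadata of the file.
parquet-tools inspect part-00000.parquet

# Print the rows of data to verify the results.
parquet-tools show part-00000.parquet
```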