Member since
07-17-2018
4
Posts
0
Kudos Received
0
Solutions
08-09-2018
02:33 PM
1 Kudo
@Rinki
Flow:
1. List the files in the directory on the first day of every month, then check the filename attribute with a RouteOnAttribute processor to keep only the current-date files. In RouteOnAttribute you can use either of the above attributes with NiFi Expression Language to keep only the required files.
2. Use a ReplaceText processor to replace the content with the required metadata, then store it in HDFS, Hive, etc. I'm assuming the file type is csv, avro, or json, so I used the expression ${filename:substringAfter('.')} for it.
Replacement Value
${filename},${file.creationTime},${filename:substringAfter('.')},${file.size}
To store the data in a table, use PutHDFS and create a table on top of that directory.
3. Use a cron schedule to run the processor on the first day of the month, with Execution set to Primary node only.
-
If the answer helped resolve your issue, click the Accept button below to accept it. That helps Community users find solutions to these kinds of issues quickly.
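To make the Replacement Value easier to reason about, here is a rough Python sketch of what that expression evaluates to for a single file. This is not NiFi code: the function names and the sample values are made up for illustration, and substring_after only mimics the documented behaviour of substringAfter (text after the first occurrence of the argument, or the whole value if the argument is absent).

```python
def substring_after(subject: str, delimiter: str) -> str:
    """Mimic NiFi EL substringAfter: return the text after the FIRST
    occurrence of delimiter, or the whole subject if it is absent."""
    index = subject.find(delimiter)
    if index == -1:
        return subject
    return subject[index + len(delimiter):]


def metadata_line(filename: str, creation_time: str, size: int) -> str:
    """Build the comma-separated line that the Replacement Value describes:
    filename, creation time, file type, file size."""
    file_type = substring_after(filename, ".")
    return ",".join([filename, creation_time, file_type, str(size)])


if __name__ == "__main__":
    # Hypothetical sample values, only to show the shape of the output.
    print(metadata_line("sales.csv", "2018-08-01T00:00:00+0000", 1024))
    # A filename with more than one dot keeps everything after the first dot.
    print(substring_after("data.2018.csv", "."))
```

Note that a filename containing more than one dot gives everything after the first dot as the "type". If your filenames look like that, substringAfterLast('.') is probably the better choice in the expression.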
07-18-2018
01:37 PM
@Rinki Please start a new forum question. I am probably not the best resource for SQL statements. Starting a new question will get you a faster response. - Thank you, Matt