Created 06-02-2023 02:15 PM
Very new to NiFi, and I am trying to create a daily email with throughput totals from the previous day. I was getting nowhere figuring out how to capture and report on NiFi processor throughput, so I figured I could start with a GetHDFSFileInfo processor that looks at the various HDFS tables and reports on the previous day's directory size, similar to "hdfs dfs -du -h /warehouse/tablespace/managed/hive/table1", and then grab the partition directory size from the previous day. I could script it easily enough, but I would like to keep everything in NiFi so I don't have to worry about scripts and cron jobs.
When I try to use GetHDFSFileInfo without recursing directories, I get a list of all the partition directories, but the length value is always 0. If I enable recursion, then I get every file (and those do have length values), but not the directory-level totals.
Just curious if there was a way to have GetHDFSFileInfo provide partition level directory disk usage.
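For the "previous day" part of the flow, the selection boils down to formatting yesterday's date into the partition directory name. A minimal sketch, assuming Hive-style `dt=YYYY-MM-DD` partition directories (the actual partition column name and date format will differ per table):

```python
from datetime import date, timedelta

def partition_path(table_root: str, day: date, column: str = "dt") -> str:
    """Build the HDFS path of one daily partition directory.

    Assumes Hive-style partitioning (column=YYYY-MM-DD); adjust the
    column name and format to match the real table layout.
    """
    return f"{table_root}/{column}={day.isoformat()}"

# Yesterday's partition for a given table root:
yesterday = date.today() - timedelta(days=1)
print(partition_path("/warehouse/tablespace/managed/hive/table1", yesterday))
```

The same date arithmetic can be done with NiFi Expression Language (e.g. in the processor's directory property) rather than a script, which keeps everything inside the flow.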
Jeff
Created 06-05-2023 04:32 AM
Partition-level HDFS directory disk usage is not available, since GetHDFSFileInfo works on the given directory path only and not at the disk level.
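As a workaround, one pattern is to keep recursion enabled (so every file record carries a real length) and then sum those lengths per parent directory downstream, which reproduces what "hdfs dfs -du" reports at one directory level. A minimal Python sketch of that aggregation step, assuming per-file (path, length) pairs such as the path/length values GetHDFSFileInfo emits (verify the exact attribute or record field names against the processor documentation):

```python
import os
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def directory_usage(files: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Roll per-file lengths up into per-directory totals,
    like `hdfs dfs -du` one level above the files."""
    totals: Dict[str, int] = defaultdict(int)
    for path, length in files:
        totals[os.path.dirname(path)] += length
    return dict(totals)

# Example: files from two daily partitions of one table
listing = [
    ("/hive/table1/dt=2023-06-01/part-0000", 1024),
    ("/hive/table1/dt=2023-06-01/part-0001", 2048),
    ("/hive/table1/dt=2023-06-02/part-0000", 512),
]
print(directory_usage(listing))
```

Inside NiFi this kind of roll-up could live in an ExecuteScript processor or a record-oriented aggregation, so no external cron job is needed.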
Thank you
Created 06-02-2023 03:39 PM
@JeffB Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our NiFi experts @MattWho and @steven-matison who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres
Created 06-08-2023 08:08 AM
@JeffB Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
Regards,
Diana Torres