
NiFi GetHDFSFileInfo Directory Size

New Contributor

I am very new to NiFi and am trying to create a daily email with throughput totals from the previous day. I was getting nowhere figuring out how to capture and report on NiFi processor throughput, so I figured I could start with a GetHDFSFileInfo processor that looks at the various HDFS tables and reports on the previous day's directory size. This would be similar to running "hdfs dfs -du -h /warehouse/tablespace/managed/hive/table1" and then grabbing the partition directory size from the previous day. I could script it easily enough, but I would like to keep everything in NiFi so I don't have to worry about scripts and cron jobs.

 

When I try to use GetHDFSFileInfo and do not recurse directories, I can get a list of all the partition directories, but the length value is always 0. If I enable recursion, I get every file (and each file has a length value).

 

Just curious if there was a way to have GetHDFSFileInfo provide partition level directory disk usage.

 

Jeff

1 ACCEPTED SOLUTION

Master Collaborator

Partition-level HDFS directory disk usage is not available, since GetHDFSFileInfo works on the given directory path only and not at the disk level.
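Since the question notes that a recursive listing does return a length for every file, one possible workaround is to sum those lengths per partition directory. The following is only a rough sketch of that aggregation in Python, not something confirmed in this thread. It assumes each listed object is available as a dict carrying `hdfs.path` (the parent directory), `hdfs.type`, and `hdfs.length`; those attribute names, the helper name `partition_sizes`, and the `dt=...` partition names are assumptions for illustration and should be checked against the actual flowfile attributes.

```python
from collections import defaultdict


def partition_sizes(file_records, table_root):
    """Sum file lengths per first-level partition directory under table_root.

    file_records: iterable of dicts with the (assumed) keys 'hdfs.path'
    (parent directory of the object), 'hdfs.type' and 'hdfs.length' (bytes).
    """
    root = table_root.rstrip("/")
    totals = defaultdict(int)
    for rec in file_records:
        if rec.get("hdfs.type") != "file":
            continue  # skip directories; the question reports length 0 for them
        parent = rec["hdfs.path"].rstrip("/")
        if not parent.startswith(root + "/"):
            continue  # file is not inside a partition directory of this table
        # First path component below the table root is treated as the partition.
        partition = parent[len(root) + 1:].split("/", 1)[0]
        totals[partition] += int(rec["hdfs.length"])
    return dict(totals)


# Hypothetical sample records; partition and subdirectory names are made up.
root = "/warehouse/tablespace/managed/hive/table1"
sample = [
    {"hdfs.path": root + "/dt=2024-01-01", "hdfs.type": "file", "hdfs.length": "100"},
    {"hdfs.path": root + "/dt=2024-01-01/sub", "hdfs.type": "file", "hdfs.length": "50"},
    {"hdfs.path": root + "/dt=2024-01-02", "hdfs.type": "file", "hdfs.length": "7"},
    {"hdfs.path": root, "hdfs.type": "directory", "hdfs.length": "0"},
]
print(partition_sizes(sample, root))  # {'dt=2024-01-01': 150, 'dt=2024-01-02': 7}
```

Files nested more deeply under a partition are counted toward that partition, which roughly mirrors what "hdfs dfs -du" reports for a directory. Whether this logic is best run in a scripting processor or expressed another way inside NiFi is left open here.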

Thank you 

 

 

 


3 REPLIES

Community Manager

@JeffB Welcome to the Cloudera Community!

To help you get the best possible solution, I have tagged our NiFi experts @MattWho and @steven-matison, who may be able to assist you further.

Please keep us updated on your post, and we hope you find a satisfactory solution to your query.


Regards,

Diana Torres,
Community Moderator


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Community Manager

@JeffB Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.


Regards,

Diana Torres,
Community Moderator

