I'm looking for a smart solution for monitoring HDFS folder quotas.
I know how to get the quota of a single directory (hadoop fs -count -q /path/to/directory), and I could also script this recursively, but on a very large HDFS that is not efficient.
Has anyone used, or does anyone know of, a smart/efficient solution for this?
Or a way to show all folders that have quotas?
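For reference, a minimal sketch of the recursive-script approach in Python, assuming the usual hadoop fs -count -q column layout (QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATH, with "none"/"inf" when no quota is set). Passing many paths to a single invocation avoids one JVM start-up per directory, which is where a naive loop loses most of its time:

```python
import subprocess

def parse_count_q(line):
    """Parse one `hadoop fs -count -q` output line into a dict.

    Assumed columns: QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA
    DIR_COUNT FILE_COUNT CONTENT_SIZE PATH; unset quotas appear as 'none'.
    """
    fields = line.split()
    if len(fields) < 8:
        return None
    quota, _, space_quota, _ = fields[:4]
    return {
        "path": fields[7],
        "name_quota": None if quota == "none" else int(quota),
        "space_quota": None if space_quota == "none" else int(space_quota),
    }

def dirs_with_quota(lines):
    """Keep only entries where a name or space quota is actually set."""
    parsed = (parse_count_q(l) for l in lines)
    return [p for p in parsed
            if p and (p["name_quota"] is not None
                      or p["space_quota"] is not None)]

def check_paths(paths):
    """One `hadoop fs -count -q` call over many paths at once --
    a single JVM launch instead of one per directory."""
    out = subprocess.run(["hadoop", "fs", "-count", "-q"] + paths,
                         capture_output=True, text=True, check=True)
    return dirs_with_quota(out.stdout.splitlines())
```

This still walks the namespace from the client side, so it only mitigates (not removes) the scaling problem the question describes.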
Thanks in advance
Apart from the command line, you can view this information in Cloudera Manager: navigate to Cloudera Manager > HDFS > File Browser and click the directory whose quota usage you want to see.
For a consolidated report, have you tried the disk usage reports available in CM? You can download the usage report in CSV format for offline analysis. Please refer to the link below for more details.
I don't think there is a way to retrieve this information via the REST API.
You could write a Python script to retrieve the quota (https://pyhdfs.readthedocs.io/en/latest/pyhdfs.html).
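Something along these lines might work as a starting point, assuming the WebHDFS convention that a ContentSummary reports quota == -1 when none is set. The NameNode host and the max_depth limit are placeholders; capping the depth keeps the walk cheap on a large namespace:

```python
def has_quota(summary):
    """True if a name or space quota is set (WebHDFS uses -1 for 'unset')."""
    return summary.quota != -1 or summary.spaceQuota != -1

def walk_quotas(client, path="/", max_depth=2):
    """Yield (path, ContentSummary) for quota'd directories down to max_depth.

    `client` is duck-typed: anything exposing get_content_summary() and
    list_status() in the pyhdfs style should work.
    """
    summary = client.get_content_summary(path)
    if has_quota(summary):
        yield path, summary
    if max_depth <= 0:
        return
    for status in client.list_status(path):
        if status.type == "DIRECTORY":
            child = path.rstrip("/") + "/" + status.pathSuffix
            yield from walk_quotas(client, child, max_depth - 1)

if __name__ == "__main__":
    import pyhdfs  # third-party: pip install pyhdfs
    client = pyhdfs.HdfsClient(hosts="namenode:9870")  # placeholder host
    for p, s in walk_quotas(client):
        print(p, s.quota, s.spaceQuota)
```

Note this still issues one GETCONTENTSUMMARY call per directory, so the depth limit (or a known list of quota'd parent dirs) matters on a big cluster.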
You can try to configure a custom Ambari alert with the script.
Let me know if it works for you, because it could be useful to many people.