Posts: 7
Registered: 09-18-2018

HDFS sorting files in GB




I am trying to automate the HDFS user space utilisation mail.

Everything works correctly except sorting the usage in GB. When I sort the output in bytes I get the correct result, but with the human-readable (-h) output the order is not what I expect.

Please help me get the correct output.

Below is the script I run with byte output; the human-readable (-h) variant I tried is noted in a comment.


#getting the current hdfs percentage in numeric value

CURRENT=$(hdfs dfs -df -h / | grep / | awk '{ print $8}' | sed 's/%//g')

#current hdfs space utilisation
DiskFile=$(hdfs dfs -df -h)

HdfsReport=$(hdfs dfsadmin -report)

Diskuse=$(hdfs dfs -du /user | sort -nr | head -10)

#To get the results in GB I tried $(hdfs dfs -du -h /user | sort -r | head -10), but this does not sort correctly

if [ "$CURRENT" -gt "$THRESHOLD" ] ; then

mail -s 'HDFS Usage Housekeeping required', << EOF
HDFS usage in the cluster is above the threshold. Please run the clean-up scripts ASAP. Used: $CURRENT%

Current disk utilization report is

Please find the Utilisation report of top ten users consuming the cluster



if [ "$CURRENT" -gt "$Critical" ] ; then

mail -s 'HDFS Admin Report', << EOF
HDFS usage in the cluster is above critical storage. Please find the cluster report below.






Please help me sort this out.

Thanks in advance


Re: HDFS sorting files in GB


# Sizes in TB: sort on raw bytes first, then divide by 1024^4 (1099511627776)
Diskuse=$(hdfs dfs -du /user | sort -n -r | awk '{print $1/1099511627776," TB",
$2/1099511627776," TB ", $3}' | head -10)



# Sizes in GB: sort on raw bytes first, then divide by 1024^3 (1073741824)
Diskuse=$(hdfs dfs -du /user | sort -n -r | awk '{print $1/1073741824," GB",
$2/1073741824," GB ", $3}' | head -10)
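
The -h output does not sort correctly because sort -r compares the lines as text, so a line starting with "9.5 M" ranks above one starting with "10.2 G". Sorting on the raw byte counts and converting to GB only for display avoids that. Below is a minimal sketch of the same idea wrapped in a helper; the function name top_usage_gb is hypothetical, and it assumes the three-column -du output (size, space consumed, path) and paths without spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of HDFS): reads `hdfs dfs -du` style lines
# on stdin in the assumed form "<bytes> <bytes consumed> <path>",
# sorts them by the first column as numbers (largest first), keeps the top N,
# and prints both sizes in GB with two decimals.
top_usage_gb() {
    n="${1:-10}"    # number of rows to keep, default 10
    sort -k1,1nr | head -n "$n" |
        awk '{ printf "%.2f GB  %.2f GB  %s\n", $1 / 1073741824, $2 / 1073741824, $3 }'
}

# Usage in the mail script (assumption: paths under /user contain no spaces):
#   Diskuse=$(hdfs dfs -du /user | top_usage_gb 10)
```

Using printf with %.2f keeps the columns readable in the mail body, where plain print would show values such as 0.000123456.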