HDFS sorting files in GB

Hi,

I am trying to automate an email report of HDFS user space utilisation.

Everything works except sorting the usage figures in GB. When I sort the output in bytes the result is correct, but with the human-readable output the order is not what I expect.

Please help me get the correct output.

Below is the script I run with byte output, along with the human-readable (-h) variant of the command.


#!/bin/bash

#getting the current hdfs percentage in numeric value

CURRENT=$(hdfs dfs -df -h / | grep / | awk '{ print $8}' | sed 's/%//g')

#current hdfs space utilisation
DiskFile=$(hdfs dfs -df -h)

HdfsReport=$(hdfs dfsadmin -report)

Diskuse=$(hdfs dfs -du  /user | sort -nr | head -10)

#To get the results in GB I tried: $(hdfs dfs -du -h  /user | sort -r | head -10)
THRESHOLD=70
Critical=90

if [ "$CURRENT" -gt "$THRESHOLD" ] ; then

mail -s 'HDFS Usage Housekeeping required' @ABC.com, @ABC.com << EOF
HDFS usage in the cluster is above the threshold. Please run the clean-up scripts ASAP. Used: $CURRENT%

Current disk utilization report is
$DiskFile

Please find the Utilisation report of top ten users consuming the cluster

$Diskuse



EOF
fi

if [ "$CURRENT" -gt "$Critical" ] ; then

mail -s 'HDFS Admin Report' yy@abc.com, yyy@abc.com << EOF
HDFS usage in the cluster is above the critical level. Please find the cluster report below.

$HdfsReport

 

EOF
fi

Please help me sort this out.

Thanks in advance

1 REPLY

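# Sizes in TB (1 TB = 1024^4 bytes). Assumes the three-column -du output:
# size, space consumed including replication, path.
Diskuse=$(hdfs dfs -du /user | sort -n -r | awk '{print $1/1099511627776, " TB", $2/1099511627776, " TB ", $3}' | head -10)

# Sizes in GB (1 GB = 1024^3 bytes).
Diskuse=$(hdfs dfs -du /user | sort -n -r | awk '{print $1/1073741824, " GB", $2/1073741824, " GB ", $3}' | head -10)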
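
The reason this approach works is that the sort runs on the raw byte counts, and the conversion to GB happens only afterwards. Sorting the -h output with plain sort -r compares the values as text, so "5 G" can land above "20 G". Below is a minimal sketch of the same idea wrapped in a function so it can be tried without a cluster. The function name top_usage_gb and the sample figures are made up for illustration, and the three-column input layout (size, space consumed with replication, path) is an assumption about the -du output of your Hadoop version.

```shell
#!/bin/bash

# Hypothetical helper: reads `hdfs dfs -du`-style lines from stdin
# (bytes, bytes including replication, path) and prints the N largest in GB.
top_usage_gb() {
    local count="${1:-10}"
    # Sort numerically on the byte count first, then convert for display.
    sort -k1,1nr | head -n "$count" |
        awk '{ printf "%.2f GB\t%.2f GB\t%s\n", $1/1073741824, $2/1073741824, $3 }'
}

# Made-up sample standing in for real `hdfs dfs -du /user` output.
sample='5368709120 16106127360 /user/alice
966367641600 2899102924800 /user/bob
21474836480 64424509440 /user/carol'

printf '%s\n' "$sample" | top_usage_gb 2
```

With the sample data this prints /user/bob (900.00 GB) first and /user/carol (20.00 GB) second. In the original script the call would presumably become Diskuse=$(hdfs dfs -du /user | top_usage_gb 10). Paths containing spaces would need extra handling, since awk splits fields on whitespace.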