How to determine the resource consumption of individual users within a Resource Pool

Rising Star

Hi dear experts!

I have a challenge.

I have a dynamic service pool, let's say root.marketing.

Many users who belong to this pool (Bob, Alice, Tom) submit jobs to it.

I want to know the resource consumption of each user.

For example: over the last day Bob used 33 cores on average, Alice 12, Tom 118, or something like this.

In other words, I want to know who consumes what within the same pool.

Thanks!


7 REPLIES

Look at the charts in Cloudera Manager. The exact per-user figure is not there, but you can quickly work out who consumed the most CPU (vcores) and memory.

Rising Star

Thank you, Thomas! Do you mean some specific charts? 🙂

I checked Cloudera Manager -> YARN -> Resource Pools.

There are indeed lots of useful charts, but they show consumption per pool.

For example, the pool could be root.marketing, but within this pool there could be multiple users.

So I want to understand which users consume which resources.

Champion

@fil

You can get this report from Cloudera Navigator. Search by user ID and apply filters as needed.

Master Collaborator

@fil I would suggest creating a shell script that pulls this data from the YARN ResourceManager.

I created a shell script for myself that collects the data on a daily basis and aggregates the memory and CPU time for each pool. You can certainly do the same per user, and even per job if needed.

See below:

Note: I removed some of the code that relates to my data center, and I may have deleted something that makes the script fail on your side, so try playing around with it.

Let me know if you need any more help with that. You can certainly replace the queue with the user here.

 

 

#!/bin/bash

STARTDATE=`date -d " -1 day " +%s%N | cut -b1-13`
ENDDATE=`date +%s%N | cut -b1-13`

result=`curl -s http://yarn_resource_manager:8088/ws/v1/cluster/apps?finishedTimeBegin=$STARTDATE&finishedTimeEnd=$ENDDATE`
if [[ $result =~ "standby RM" ]]; then
result=`curl -s http://yarn_resource_manager2:8088/ws/v1/cluster/apps?finishedTimeBegin=$STARTDATE&finishedTimeEnd=$ENDDATE`


echo $result | python -m json.tool | sed 's/["|,]//g' | grep -E "queue|coreSeconds" | awk -v ' /queue/ { queue = $2 }
/vcoreSeconds/ { arr[queue]+=$2 ; }
END { for (x in arr) {print ".yarn." x ".cpums="arr[x]} } '

echo $result | python -m json.tool | sed 's/["|,]//g' | grep -E "queue|memorySeconds" | awk ' /queue/ { queue = $2 }
/memorySeconds/ { arr1[queue]+=$2 ; }
END { for (y in arr1) {print ".yarn." y ".memorySeconds="arr1[y]} } '

Rising Star

@Fawze

Thanks for the script.

Unfortunately it gives me an error:

./users_resource_cons.sh: line 14: syntax error: unexpected end of file

Master Collaborator

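@fil Here you go:

#!/bin/bash

# Time window: the last 24 hours, in milliseconds since the epoch.
STARTDATE=$(date -d "-1 day" +%s%N | cut -b1-13)
ENDDATE=$(date +%s%N | cut -b1-13)

# The URL must be quoted, otherwise the shell treats "&" as the background operator.
QUERY="ws/v1/cluster/apps?finishedTimeBegin=${STARTDATE}&finishedTimeEnd=${ENDDATE}"
result=$(curl -s "http://resource_manager1:8088/${QUERY}")
# Fall back to the second ResourceManager if the first one is in standby.
if [[ $result =~ "standby RM" ]]; then
    result=$(curl -s "http://resource_manager2:8088/${QUERY}")
fi
#echo "$result"

# Sum vcoreSeconds per user (the "cpums" label is kept; the value is vcore-seconds).
echo "$result" | python -m json.tool | sed 's/["|,]//g' | grep -E "user|coreSeconds" | awk ' $1 == "user:" { user = $2 }
$1 == "vcoreSeconds:" { arr[user]+=$2 ; }
END { for (x in arr) {print "yarn." x ".cpums="arr[x]} } '

# Sum memorySeconds per user. This assumes "user" is printed before "memorySeconds"
# within each application; if json.tool sorts the keys alphabetically, the totals
# can be attributed to the wrong user.
echo "$result" | python -m json.tool | sed 's/["|,]//g' | grep -E "user|memorySeconds" | awk ' $1 == "user:" { user = $2 }
$1 == "memorySeconds:" { arr1[user]+=$2 ; }
END { for (y in arr1) {print "yarn." y ".memorySeconds="arr1[y]} } '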
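
The script above prints totals in vcore-seconds, while the question asked for average cores over the last day. Below is a minimal sketch of that conversion, assuming the `yarn.<user>.cpums=<value>` output format of the script. The names `avg_vcores` and `WINDOW_SECONDS` are hypothetical and not part of the thread. Dividing the total by the window length in seconds (86400 for one day) gives the average number of vcores held. Treat the result as an approximation: the query selects applications that finished inside the window, so usage from before the window is included and applications that are still running are not.

```shell
#!/bin/bash
# Sketch: turn "yarn.<user>.cpums=<vcoreSeconds>" lines into average vcores.
# avg_vcores and WINDOW_SECONDS are hypothetical names, not part of the thread.
WINDOW_SECONDS=${WINDOW_SECONDS:-86400}   # one day, matching the script's window

avg_vcores() {
    LC_ALL=C awk -F'=' -v window="$WINDOW_SECONDS" '
        $1 ~ /\.cpums$/ {
            user = $1
            sub(/^yarn\./, "", user)     # strip the metric prefix
            sub(/\.cpums$/, "", user)    # strip the metric suffix
            printf "%s avg_vcores=%.2f\n", user, $2 / window
        }'
}
```

It could be used as `./users_resource_cons.sh | avg_vcores`. For instance, a user with 2851200 vcore-seconds in a one-day window comes out at 2851200 / 86400 = 33.00 average vcores.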

Rising Star

Awesome! Thank you!
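
The per-user script in this thread sums usage across every pool, while the original question was about users inside one pool such as root.marketing. The following is a rough sketch of one way to restrict the totals to a single pool, assuming each application object in the ResourceManager response carries `user`, `queue`, `vcoreSeconds` and `memorySeconds` fields and that the input is pretty-printed JSON with one key per line (as `python -m json.tool` produces). The helper name `usage_by_user_in_pool` and the output key format are hypothetical. It tracks brace depth so that fields are grouped per application regardless of key order; a string value that itself ends in `{` would confuse it.

```shell
#!/bin/bash
# Sketch: sum vcoreSeconds and memorySeconds per user for one pool (queue).
# Reads pretty-printed JSON on stdin; usage_by_user_in_pool is a hypothetical name.
usage_by_user_in_pool() {
    local pool="$1"
    awk -v pool="$pool" '
        {
            line = $0
            gsub(/[",]/, "", line)               # drop quotes and commas
            gsub(/^[ \t]+|[ \t]+$/, "", line)    # trim surrounding whitespace
        }
        line ~ /[{]$/ {                          # an object opens
            depth++
            # depth 3 is one application: root object, apps object, app element
            if (depth == 3) { user = ""; queue = ""; vcore = 0; mem = 0 }
            next
        }
        line ~ /^[}]/ {                          # an object closes
            if (depth == 3 && queue == pool) { cpu[user] += vcore; ram[user] += mem }
            depth--
            next
        }
        depth == 3 {                             # a field of the current application
            n = index(line, ": ")
            if (n == 0) next
            key = substr(line, 1, n - 1)
            val = substr(line, n + 2)
            if (key == "user") user = val
            else if (key == "queue") queue = val
            else if (key == "vcoreSeconds") vcore = val
            else if (key == "memorySeconds") mem = val
        }
        END {
            for (u in cpu) printf "yarn.%s.%s.vcoreSeconds=%.0f\n", pool, u, cpu[u]
            for (u in ram) printf "yarn.%s.%s.memorySeconds=%.0f\n", pool, u, ram[u]
        }' | sort
}
```

With the `result` variable from the script, it could be called as `echo "$result" | python -m json.tool | usage_by_user_in_pool root.marketing`.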