How to determine resource consumption for a specific user within a Resource Pool
Labels: Apache YARN
Created on ‎10-04-2018 06:50 AM - edited ‎09-16-2022 06:46 AM
Hi dear experts!
I have a challenge.
I have a dynamic resource pool, let's say root.marketing.
Many users who belong to this pool submit jobs to it (Bob, Alice, Tom).
I want to know the resource consumption of each of these users.
For example: over the last day Bob used 33 cores on average, Alice 12, Tom 118, or something like that.
In other words, I want to know who consumes what within the same pool.
Thanks!
Created ‎10-04-2018 07:13 AM
Created ‎10-05-2018 06:02 AM
Thank you Thomas! Do you mean some concrete charts? 🙂
I checked Cloudera Manager -> YARN -> Resource Pools.
There are indeed lots of useful charts, but they show consumption per pool.
For example, the pool could be root.marketing, but within this pool there could be multiple users.
So I want to understand which users consume which resources.
Created ‎10-04-2018 01:21 PM
Created ‎10-04-2018 11:05 PM
@fil I would suggest creating a shell script that pulls this data from the YARN ResourceManager.
I wrote a shell script for myself that collects the data daily and aggregates the memory and CPU time for each pool; you can certainly do the same per user, and even per job if needed.
See below:
Note: I removed some of the code that was specific to my data center, and I may have deleted something that makes the script fail on your side, so try to play around with it.
Let me know if you need any more help with that; you can simply replace the queue with the user here.
#!/bin/bash
STARTDATE=`date -d " -1 day " +%s%N | cut -b1-13`
ENDDATE=`date +%s%N | cut -b1-13`
result=`curl -s http://yarn_resource_manager:8088/ws/v1/cluster/apps?finishedTimeBegin=$STARTDATE&finishedTimeEnd=$ENDDATE`
if [[ $result =~ "standby RM" ]]; then
result=`curl -s http://yarn_resource_manager2:8088/ws/v1/cluster/apps?finishedTimeBegin=$STARTDATE&finishedTimeEnd=$ENDDATE`
echo $result | python -m json.tool | sed 's/["|,]//g' | grep -E "queue|coreSeconds" | awk -v ' /queue/ { queue = $2 }
/vcoreSeconds/ { arr[queue]+=$2 ; }
END { for (x in arr) {print ".yarn." x ".cpums="arr[x]} } '
echo $result | python -m json.tool | sed 's/["|,]//g' | grep -E "queue|memorySeconds" | awk ' /queue/ { queue = $2 }
/memorySeconds/ { arr1[queue]+=$2 ; }
END { for (y in arr1) {print ".yarn." y ".memorySeconds="arr1[y]} } '
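The script above groups by queue across the whole cluster. To look at a single pool such as root.marketing, the ResourceManager apps endpoint also accepts a `queue` query parameter next to the time filters. Below is a minimal sketch of building such a URL; `apps_url` is a hypothetical helper name, the host is a placeholder, and whether the full queue name or only the leaf name is expected may depend on your scheduler and Hadoop version.

```shell
#!/bin/bash
# Build the RM apps URL for one pool and a time window (epoch milliseconds).
# apps_url is a hypothetical helper; host and queue values are placeholders.
apps_url() {
  local rm_host="$1" queue="$2" start_ms="$3" end_ms="$4"
  printf 'http://%s:8088/ws/v1/cluster/apps?queue=%s&finishedTimeBegin=%s&finishedTimeEnd=%s\n' \
    "$rm_host" "$queue" "$start_ms" "$end_ms"
}

# Last 24 hours, in milliseconds (date -d is GNU date syntax).
start_ms=$(( $(date -d "-1 day" +%s) * 1000 ))
end_ms=$(( $(date +%s) * 1000 ))

url=$(apps_url resource_manager1 root.marketing "$start_ms" "$end_ms")
echo "$url"
# Quote the URL when passing it to curl, e.g.: curl -s "$url"
# Unquoted, the shell treats '&' as "run in background" and cuts the URL short.
```

The per-user grouping then works the same way as the per-queue grouping, only on the filtered result.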
Created ‎10-05-2018 06:08 AM
Thanks for the script...
Unfortunately it gives me an error:
./users_resource_cons.sh: line 14: syntax error: unexpected end of file
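For anyone hitting the same message: "unexpected end of file" from bash usually means a block was left open, and in the script posted above the `if` has no closing `fi` (the first `awk -v` also has no variable assignment after it, which would likely fail next). A quick way to check a script without running it is `bash -n`. A small sketch using throwaway temp files, where `check_syntax` is just an illustrative helper name:

```shell
#!/bin/bash
# Show that bash -n (parse only, nothing is executed) flags an unclosed
# if-block and passes once the closing "fi" is present.
broken=$(mktemp)
fixed=$(mktemp)

printf 'if [[ "$x" =~ "standby RM" ]]; then\n  echo retry\n' > "$broken"
printf 'if [[ "$x" =~ "standby RM" ]]; then\n  echo retry\nfi\n' > "$fixed"

# check_syntax is a hypothetical helper: prints "ok" or "syntax error".
check_syntax() {
  if bash -n "$1" 2>/dev/null; then
    echo "ok"
  else
    echo "syntax error"
  fi
}

check_syntax "$broken"   # the if is never closed
check_syntax "$fixed"    # same block with fi added

rm -f "$broken" "$fixed"
```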
Created ‎10-05-2018 06:28 AM
@fil Here you go:
#!/bin/bash
# Time window: the last 24 hours, as epoch milliseconds.
STARTDATE=$(date -d "-1 day" +%s%N | cut -b1-13)
ENDDATE=$(date +%s%N | cut -b1-13)
# The URL must be quoted, otherwise the shell treats '&' as a background operator.
result=$(curl -s "http://resource_manager1:8088/ws/v1/cluster/apps?finishedTimeBegin=$STARTDATE&finishedTimeEnd=$ENDDATE")
# If the first RM is the standby one, ask the second RM instead.
if [[ $result =~ "standby RM" ]]; then
result=$(curl -s "http://resource_manager2:8088/ws/v1/cluster/apps?finishedTimeBegin=$STARTDATE&finishedTimeEnd=$ENDDATE")
fi
#echo $result
# Sum vcoreSeconds per user (despite the "cpums" label, the value is vcore-seconds).
echo "$result" | python -m json.tool | sed 's/["|,]//g' | grep -E "user|coreSeconds" | awk ' /user/ { user = $2 }
/vcoreSeconds/ { arr[user]+=$2 ; }
END { for (x in arr) {print "yarn." x ".cpums="arr[x]} } '
# Sum memorySeconds per user.
echo "$result" | python -m json.tool | sed 's/["|,]//g' | grep -E "user|memorySeconds" | awk ' /user/ { user = $2 }
/memorySeconds/ { arr1[user]+=$2 ; }
END { for (y in arr1) {print "yarn." y ".memorySeconds="arr1[y]} } '
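The script prints total vcore-seconds per user, while the original question asked for something like "Bob used 33 cores on average". One way to get there is to divide each total by the length of the window in seconds (86400 for one day). A minimal sketch, assuming input lines in the `yarn.<user>.cpums=<value>` format printed above; `avg_cores` is a hypothetical helper name. Only applications that finished inside the window are counted, and each contributes its whole runtime, so treat the result as an approximation.

```shell
#!/bin/bash
# Convert per-user vcore-second totals into average vcores over a window.
# Input lines look like: yarn.<user>.cpums=<vcoreSeconds>
# avg_cores is a hypothetical helper; the window defaults to one day.
avg_cores() {
  local window_seconds="${1:-86400}"
  awk -F'=' -v w="$window_seconds" '
    /^yarn\..*\.cpums=/ {
      user = $1
      sub(/^yarn\./, "", user)     # strip the "yarn." prefix
      sub(/\.cpums$/, "", user)    # strip the ".cpums" suffix
      printf "%s %.2f\n", user, $2 / w
    }'
}

# Example: 2851200 vcore-seconds spread over 86400 s is 33 vcores on average.
printf 'yarn.bob.cpums=2851200\nyarn.alice.cpums=1036800\n' | avg_cores 86400
# prints:
# bob 33.00
# alice 12.00
```

To use it with the script above, save the script's CPU lines to a file or pipe them straight into `avg_cores`.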
Created ‎10-05-2018 07:47 AM
Awesome! Thank you!
