Created 10-17-2024 04:58 AM
I would like to know how to obtain the values of total_io_mb, mapreduce_inputBytes, and mapreduce_outputBytes for each application within a certain period of time.
I believe that these values are written to the YARN log files, so I would like an SME's advice on how to retrieve them, and on how to configure these values to be written to the log if they are not there by default.
In Cloudera Manager we can only get the rates (e.g. yarn_application_hdfs_bytes_read_rate), not the per-application totals. Alternative ideas are welcome.
The purpose is to evaluate the disk IO of the entire CDH cluster, which is made up of several hundred bare metal machines, and to estimate the disk IO required for the next generation of bare metal and/or storage nodes.
Created 10-17-2024 08:49 PM
@yoshio_ono Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our YARN experts @AKR @Xiaomin @srathore who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres
Created 10-29-2024 11:53 PM
Hi @yoshio_ono ,
Please check this article https://my.cloudera.com/knowledge/How-to-calculate-memory-and-v-core-utilization-for-each-of-the?id=....
Created 10-30-2024 08:52 PM
Thanks. It looks like that article covers "memory and v-core utilization" rather than total_io_mb.
Disk IO should be in the YARN logs. I will try to look for it there.
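As an alternative to parsing log files, here is a minimal sketch (not a confirmed answer from this thread) that reads per-job byte counters from the MapReduce JobHistory Server REST API for jobs started within a time window. It assumes the JobHistory Server web endpoint is reachable over plain HTTP without Kerberos/SPNEGO, and that the jobs are MapReduce jobs still retained by the history server. The helper names and the `io_mb` definition (sum of all `*_BYTES_READ` and `*_BYTES_WRITTEN` file system counters) are hypothetical and may not match how Cloudera Manager computes total_io_mb.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Counter group that holds HDFS_BYTES_READ, FILE_BYTES_WRITTEN, etc.
FS_COUNTER_GROUP = "org.apache.hadoop.mapreduce.FileSystemCounter"


def _get_json(url):
    """GET a URL and decode the JSON body (assumes no authentication)."""
    req = Request(url, headers={"Accept": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)


def list_job_ids(history_url, start_ms, end_ms):
    """List IDs of jobs whose start time (epoch ms) falls in [start_ms, end_ms]."""
    query = urlencode({"startedTimeBegin": start_ms, "startedTimeEnd": end_ms})
    body = _get_json(f"{history_url}/ws/v1/history/mapreduce/jobs?{query}")
    jobs = (body.get("jobs") or {}).get("job", [])
    return [job["id"] for job in jobs]


def fetch_job_counters(history_url, job_id):
    """Fetch the raw counters document for one job."""
    return _get_json(f"{history_url}/ws/v1/history/mapreduce/jobs/{job_id}/counters")


def extract_fs_counters(counters_json):
    """Return {counter_name: totalCounterValue} for the file system counter group."""
    groups = (counters_json.get("jobCounters") or {}).get("counterGroup", [])
    result = {}
    for group in groups:
        if group.get("counterGroupName") == FS_COUNTER_GROUP:
            for counter in group.get("counter", []):
                result[counter["name"]] = counter["totalCounterValue"]
    return result


def io_mb(fs_counters):
    """Hypothetical total: all bytes read and written, converted to MiB."""
    total_bytes = sum(
        value
        for name, value in fs_counters.items()
        if name.endswith("_BYTES_READ") or name.endswith("_BYTES_WRITTEN")
    )
    return total_bytes / (1024 * 1024)


# Example usage (host, port, and time window are placeholders):
# base = "http://historyserver.example.com:19888"
# for job_id in list_job_ids(base, 1729123200000, 1729209600000):
#     fs = extract_fs_counters(fetch_job_counters(base, job_id))
#     print(job_id, fs.get("HDFS_BYTES_READ"), fs.get("HDFS_BYTES_WRITTEN"), io_mb(fs))
```

These counters are logical bytes seen by the job, so they exclude HDFS replication traffic and non-MapReduce applications. For sizing physical disk IO they are probably best treated as a lower bound and cross-checked against host-level disk metrics.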