There are 250 tasks in my MapReduce job. I would like to retrieve the time taken and the size of the data read and written for each task.
Time taken: I can get this for all tasks from the ResourceManager.
Size of data: I can click each task in the ResourceManager and see how much data it read and wrote.
Is there any way or tool to collect the data size information quickly, without having to go through each task's link?
For each mapper, you can log the split size (or write it to a separate file using MultipleOutputs) in the setup function, with something like this:
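A minimal sketch of that idea, assuming the new `org.apache.hadoop.mapreduce` API and `TextInputFormat`-style key/value types. The class name `SplitSizeLoggingMapper`, the `SPLIT_SIZE` tag, and the helper `formatRecord` are hypothetical names chosen for illustration, and your own `map()` logic would replace the inherited identity one.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper: logs the size of its input split once per task attempt.
// The key/value types are an assumption (TextInputFormat input, identity map output).
public class SplitSizeLoggingMapper
        extends Mapper<LongWritable, Text, LongWritable, Text> {

    // Builds one tab-separated record so the lines are easy to grep and sum later.
    public static String formatRecord(String taskAttemptId, long splitBytes) {
        return "SPLIT_SIZE\t" + taskAttemptId + "\t" + splitBytes;
    }

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // The split assigned to this task; getLength() reports its size in bytes.
        InputSplit split = context.getInputSplit();
        long splitBytes = split.getLength();
        String attemptId = context.getTaskAttemptID().toString();

        // stderr is captured in the task attempt's log.
        System.err.println(formatRecord(attemptId, splitBytes));
    }

    // map() is inherited unchanged here (identity); put your own logic in its place.
}
```

Each task attempt then writes one `SPLIT_SIZE` line to its log, so the sizes can be gathered by searching the job's logs instead of opening every task page. Note that the split length only covers the input side; the bytes a task writes are reported in its file system counters rather than by `getLength()`.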