04-21-2015 08:23 AM
After my MapReduce job is done, the following timing data are generated.
Total time spent by all maps in occupied slots (ms)=9859
Total time spent by all reduces in occupied slots (ms)=10878
Total time spent by all map tasks (ms)=9859
Total time spent by all reduce tasks (ms)=10878
Total vcore-seconds taken by all map tasks=9859
Total vcore-seconds taken by all reduce tasks=10878
Total megabyte-seconds taken by all map tasks=10095616
Total megabyte-seconds taken by all reduce tasks=11139072
I have put timing statements in my driver code before & after I submit and return from a job. The wall clock time that I get is different from the sum of the counters: Total time spent by all maps in occupied slots (ms)+Total time spent by all reduces in occupied slots (ms). I am not sure why this is so. Can somebody please explain what the above timing counters exactly measure and why I see the difference in timing data?
08-19-2015 12:21 PM
My understanding is that the time listed in the counters is the clock time each individual map or reduce task took, added together across all tasks. It is not the elapsed time of the job as a whole.
For example, if I have 10 map tasks running in parallel and they each take 2.5 seconds to complete, the counter would output:
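Here is a rough sketch of that arithmetic. It assumes all 10 tasks start at the same moment and each runs for exactly 2.5 seconds. The class and method names are made up for illustration and are not part of the Hadoop API.

```java
public class TaskTimeSketch {
    // Sum of the per-task durations: what a "total time spent by all
    // map tasks (ms)" style counter adds up, as I understand it.
    static long summedTaskMillis(int taskCount, long millisPerTask) {
        return taskCount * millisPerTask;
    }

    // Elapsed wall-clock time under the assumption that every task
    // starts together and runs fully in parallel.
    static long parallelWallClockMillis(int taskCount, long millisPerTask) {
        return taskCount > 0 ? millisPerTask : 0;
    }

    public static void main(String[] args) {
        int taskCount = 10;        // 10 map tasks
        long millisPerTask = 2500; // 2.5 seconds each

        System.out.println("summed task time (ms) = "
                + summedTaskMillis(taskCount, millisPerTask));
        System.out.println("wall clock time (ms)  = "
                + parallelWallClockMillis(taskCount, millisPerTask));
    }
}
```

Under those assumptions the summed figure is 25000 ms while only 2500 ms of wall-clock time has passed, so the two numbers should not be expected to match. The timing in your driver also covers work outside the tasks themselves, such as submitting the job and waiting for it to be scheduled, which can push the gap the other way on a small job.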
Does that help?