tan
New Contributor
Posts: 1
Registered: ‎04-21-2015

Hadoop Map Reduce Time Counter

Hi,

 

After my MapReduce job is done, the following timing data are generated. 

 

Total time spent by all maps in occupied slots (ms)=9859
Total time spent by all reduces in occupied slots (ms)=10878
Total time spent by all map tasks (ms)=9859
Total time spent by all reduce tasks (ms)=10878
Total vcore-seconds taken by all map tasks=9859
Total vcore-seconds taken by all reduce tasks=10878
Total megabyte-seconds taken by all map tasks=10095616
Total megabyte-seconds taken by all reduce tasks=11139072

 

 

I put timing statements in my driver code just before I submit the job and just after it returns. The wall-clock time I measure differs from the sum of two counters: "Total time spent by all maps in occupied slots (ms)" plus "Total time spent by all reduces in occupied slots (ms)". I am not sure why. Can somebody please explain exactly what the timing counters above measure, and why I see this difference?

 

Thanks!

Cloudera Employee
Posts: 13
Registered: ‎08-19-2015

Re: Hadoop Map Reduce Time Counter

My understanding is that the time listed in these counters is the clock time each map or reduce task took, added together across all tasks.

For example:

 

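If I have 10 map tasks running in parallel and each takes 2.5 seconds to complete, the counter would report:

10 tasks * 2500 ms = 25000 ms

So the counter can be much larger than the roughly 2.5 seconds of wall-clock time those maps actually took.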
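
If it helps to see the arithmetic, here is a rough sketch in plain Python (not Hadoop code). The function names are made up for illustration, and the "all tasks start at the same moment" model is a simplifying assumption. The last part checks the counters from your post, assuming the megabyte counter is the container memory in MB multiplied by the task time in ms, which is how I read it.

```python
def summed_task_time_ms(task_durations_ms):
    """Sum of per-task durations: what I understand the time counters to report."""
    return sum(task_durations_ms)


def parallel_wall_clock_ms(task_durations_ms):
    """Elapsed time if every task starts at once (assumption): the slowest task decides."""
    return max(task_durations_ms)


# The example above: 10 map tasks, 2500 ms each, all running in parallel.
map_tasks = [2500] * 10
counter_ms = summed_task_time_ms(map_tasks)
elapsed_ms = parallel_wall_clock_ms(map_tasks)
print(counter_ms, elapsed_ms)  # 25000 2500

# Counters copied from the question.
map_ms, reduce_ms = 9859, 10878
map_mb_ms, reduce_mb_ms = 10095616, 11139072

# Memory per container implied by the counters (assumed relation: MB * ms).
map_container_mb = map_mb_ms // map_ms
reduce_container_mb = reduce_mb_ms // reduce_ms
print(map_container_mb, reduce_container_mb)  # 1024 1024
```

Under that reading, your map and reduce containers each look like 1024 MB. The wall clock in your driver would also include whatever happens outside the tasks themselves (job submission and scheduling, for instance), which as far as I know these counters do not cover, so the two numbers are not expected to match in either direction.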

 

Does that help?
