Explorer
Posts: 25
Registered: ‎07-21-2015
Accepted Solution

How Map and Reduce operations are actually carried out

Hi,

I came across this question. Can anyone please tell me the correct answer, with an explanation?

Which best describes how TextInputFormat processes input files and line breaks?


A. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader
of the split that contains the beginning of the broken line.
B. Input file splits may cross line breaks. A line that crosses file splits is read by the
RecordReaders of both splits containing the broken line.
C. The input file is split exactly at the line breaks, so each RecordReader will read a series of
complete lines.
D. Input file splits may cross line breaks. A line that crosses file splits is ignored.
E. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader
of the split that contains the end of the broken line.

Thanks in advance

Posts: 1,896
Kudos: 433
Solutions: 303
Registered: ‎07-31-2013

Re: How Map and Reduce operations are actually carried out

Explorer
Posts: 25
Registered: ‎07-21-2015

Re: How Map and Reduce operations are actually carried out

Hi Harish,

Thanks for the reply.
I had already gone through the same link, but I couldn't understand it. Could you please explain it?

Posts: 1,896
Kudos: 433
Solutions: 303
Registered: ‎07-31-2013

Re: How Map and Reduce operations are actually carried out

Which part of it specifically was not clear? Could you quote it, so it can be
explained further?

Explorer
Posts: 25
Registered: ‎07-21-2015

Re: How Map and Reduce operations are actually carried out

"For example TextInputFormat will read the last line of the FileSplit past the split boundary and, when reading other than the first FileSplit, TextInputFormat ignores the content up to the first newline." What does this mean?
Posts: 1,896
Kudos: 433
Solutions: 303
Registered: ‎07-31-2013

Re: How Map and Reduce operations are actually carried out

It just means that when the split offset (starting point) is 0, i.e. the start
of the file, we read the first line. Otherwise (non-zero offsets, i.e. splits
that start mid-file) we unconditionally skip the first line, because we
know that the previous split's reader always reads one extra line at the end.
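
As a rough illustration, here is a simplified Python simulation of that rule. It is not Hadoop's actual LineRecordReader code: the function name `read_split`, the sample data, and the split sizes are all made up for this example, and it assumes plain `\n` line endings.

```python
def read_split(data: bytes, start: int, length: int) -> list:
    """Return the lines a reader would emit for the split [start, start + length)."""
    end = start + length
    pos = start
    if start != 0:
        # Not the first split: skip up to and including the first newline,
        # because the previous split's reader has already read that line.
        nl = data.find(b"\n", pos)
        pos = len(data) if nl == -1 else nl + 1
    lines = []
    # Keep reading as long as a line *starts* at or before the split end,
    # so the last line may run past the boundary (the "extra" line).
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        stop = len(data) if nl == -1 else nl
        lines.append(data[pos:stop])
        pos = stop + 1
    return lines


# 26 bytes of input, cut into hypothetical splits of 10, 10 and 6 bytes.
data = b"alpha\nbravo\ncharlie\ndelta\n"
for start, length in [(0, 10), (10, 10), (20, 6)]:
    print(start, read_split(data, start, length))
# 0 [b'alpha', b'bravo']
# 10 [b'charlie', b'delta']
# 20 []
```

Here "bravo" starts at byte 6 and crosses the boundary at byte 10. The first split's reader reads it in full, and the second split's reader skips the leftover part, so every line is read exactly once by the reader of the split in which the line begins. That corresponds to option A in the question.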

Does this help?
Explorer
Posts: 25
Registered: ‎07-21-2015

Re: How Map and Reduce operations are actually carried out

Sorry Harish, could you please explain in detail with a small example?
Posts: 1,896
Kudos: 433
Solutions: 303
Registered: ‎07-31-2013

Re: How Map and Reduce operations are actually carried out

Check http://stackoverflow.com/a/14540272 perhaps, which includes an
example.

Explorer
Posts: 25
Registered: ‎07-21-2015

Re: How Map and Reduce operations are actually carried out

Thanks Harish