
How Map and Reduce operations are actually carried out

Explorer

Hi,

 

I have gone through this question. Can anyone please tell me the correct answer, with an explanation?

 

Which best describes how TextInputFormat processes input files and line breaks?


A. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader
of the split that contains the beginning of the broken line.
B. Input file splits may cross line breaks. A line that crosses file splits is read by the
RecordReaders of both splits containing the broken line.
C. The input file is split exactly at the line breaks, so each RecordReader will read a series of
complete lines.
D. Input file splits may cross line breaks. A line that crosses file splits is ignored.
E. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader
of the split that contains the end of the broken line.

 

 

Thanks in advance

1 ACCEPTED SOLUTION

Mentor
Check http://stackoverflow.com/a/14540272 perhaps, which includes an
example.


8 REPLIES

Mentor

Explorer
Hi Harish,

Thanks for the reply.
I had gone through the same link, but I couldn't understand it. Could you please explain it?

Mentor
Which part of it was not clear, specifically? Could you quote it, so it can be
explained further?

Explorer
"For example TextInputFormat will read the last line of the FileSplit past the split boundary and, when reading other than the first FileSplit, TextInputFormat ignores the content up to the first newline." What does this mean?

Mentor
It just means that when the split offset (starting point) is 0, i.e. the start
of the file, we read the first line. Otherwise (a non-zero offset, i.e. a split
that starts mid-file) we unconditionally skip the first line, because we
know that the previous split's reader always reads an extra line at the end.
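
A small simulation may make this concrete. It is a simplified sketch in Python, not Hadoop's actual record reader code: the function `read_split` and the sample `DATA` are hypothetical names made up for illustration, splits are modelled as plain byte ranges, and only "\n" line endings are handled.

```python
def read_split(data: bytes, start: int, length: int) -> list:
    """Lines emitted by a hypothetical reader for the byte range [start, start + length)."""
    end = start + length
    pos = start
    if start != 0:
        # Not the first split: discard everything up to and including the
        # first newline, since the previous split's reader covers that line.
        newline = data.find(b"\n", pos)
        if newline == -1:
            return []
        pos = newline + 1
    lines = []
    # Read every line that starts at or before the split end, even if the
    # line itself runs past the boundary (the "extra line").
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        stop = len(data) if newline == -1 else newline
        lines.append(data[pos:stop])
        pos = stop + 1
    return lines


DATA = b"alpha\nbravo\ncharlie\ndelta\n"  # 26 bytes, 4 lines

# Three splits of up to 10 bytes each; "bravo" straddles the boundary at byte 10.
print(read_split(DATA, 0, 10))   # [b'alpha', b'bravo']
print(read_split(DATA, 10, 10))  # [b'charlie', b'delta']
print(read_split(DATA, 20, 6))   # []
```

In this model every line is emitted exactly once. The line "bravo" begins in the first split and ends in the second, and it is the first split's reader that returns it, while the second reader throws away the leftover bytes up to the first newline.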

Does this help?

Explorer
Sorry Harish, could you please explain in detail with a small example?

Mentor
Check http://stackoverflow.com/a/14540272 perhaps, which includes an
example.

Explorer
Thanks Harish