Solved
HDFS and MapReduce word count?
Labels: HDFS
New Member
Created on 11-29-2015 10:25 AM - edited 09-16-2022 02:50 AM
Can anyone explain, or direct me to material explaining, how the MapReduce word count works? I do not understand how it can work!
If a file is split into blocks and distributed over multiple nodes, how can the word count program work? The text can be split in the middle of a word, e.g. "be" in one block and "tween" in another block. How can the MapReduce job count "between" as one word if it is split over multiple blocks and nodes?
1 ACCEPTED SOLUTION
Guru
Created 11-29-2015 11:47 AM
The input format ensures that records are intact before they are sent to the mapper function. I believe this is done by sending the partial record to the machine where it will be mapped, so the overwhelming majority of the data is processed in place, but half a line per block or so may still be exchanged over the network. Tom White's book, Hadoop: The Definitive Guide, does a good job of covering details like this.
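To make the idea concrete, here is a minimal Python sketch (not Hadoop's actual code) of how a text record reader can assign complete lines to byte-range splits: a split that does not start at offset 0 skips its partial first line, and every split reads past its own end to finish the last line it started. The `read_split` function and the toy data are illustrative assumptions, not part of the Hadoop API.

```python
def read_split(data: bytes, start: int, end: int):
    """Yield the complete lines belonging to the byte-range split [start, end).

    Mirrors the idea behind Hadoop's text record reader: each line is
    processed exactly once, by exactly one split, even when it straddles
    a block boundary.
    """
    pos = start
    if start != 0:
        # Skip the partial first line; the previous split reads it in full.
        nl = data.find(b"\n", start)
        if nl == -1:
            return
        pos = nl + 1
    while pos < end:
        nl = data.find(b"\n", pos)
        if nl == -1:
            yield data[pos:]  # last line has no trailing newline
            return
        yield data[pos:nl]  # may read past `end` to finish the record
        pos = nl + 1

data = b"between the lines\nof every block\n"
# Split the "file" mid-word: "be" lands in split 1, "tween..." in split 2.
split1 = list(read_split(data, 0, 2))
split2 = list(read_split(data, 2, len(data)))
print(split1)  # [b'between the lines'] -- split 1 read past its end
print(split2)  # [b'of every block']    -- split 2 skipped the partial line
```

Note that "between" is never torn apart: split 1 owns the whole line because the line starts inside its range, and split 2 skips forward to the first newline after its start offset.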
2 REPLIES
Guru
Created 11-29-2015 11:47 AM
The input format ensures that records are intact before they are sent to the mapper function. I believe this is done by sending the partial record to the machine where it will be mapped, so the overwhelming majority of the data is processed in place, but half a line per block or so may still be exchanged over the network. Tom White's book, Hadoop: The Definitive Guide, does a good job of covering details like this.
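Once whole records reach the mappers, the word count itself is the classic map/shuffle/reduce pattern. The following is a single-process Python sketch of that pattern (an illustration, not Hadoop code); the function names `mapper`, `reducer`, and `word_count` are chosen for this example.

```python
from collections import defaultdict

def mapper(line: str):
    # Map phase: emit a (word, 1) pair for every word in one whole record.
    for word in line.split():
        yield word, 1

def reducer(word: str, counts):
    # Reduce phase: sum all partial counts emitted for one word.
    return word, sum(counts)

def word_count(lines):
    # Shuffle: group every value emitted under the same key,
    # then apply the reducer to each group.
    grouped = defaultdict(list)
    for line in lines:
        for word, one in mapper(line):
            grouped[word].append(one)
    return dict(reducer(w, c) for w, c in grouped.items())

lines = ["between the lines", "between every block"]
print(word_count(lines))
# {'between': 2, 'the': 1, 'lines': 1, 'every': 1, 'block': 1}
```

Because the record reader guarantees each line arrives whole at exactly one mapper, the mapper never sees a fragment like "be" or "tween", so the counts come out correct regardless of where the block boundaries fall.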
New Member
Created 11-29-2015 12:31 PM
OK, very cool. Thank you for the reference.