MapReduce program to display the word with the highest count in a file
- Labels: MapReduce
Created on 11-06-2014 10:09 PM - edited 09-16-2022 02:12 AM
Hi,
Can you help me with this? I want to write a MapReduce program that displays the word repeated the most times in a file.
Is there any way to modify the WordCount MapReduce program to output only a single row: the word and its count?
Thanks
Sach
Created 11-30-2014 05:07 AM
1. First, use an MR job to build a word count. In the Mapper, split each line into words and emit each word with a count of 1; in the Reducer, aggregate the counts with the word as the key.
2. Second, run a subsequent MR job that reads that output and inverts each record, so that the key is now the count and the value is the word (or words) with that count. The job then sorts by count instead of by word. Run it with either a TotalOrderPartitioner or a single reducer to get your final result.
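The two steps above can be sketched in plain Python to show the data flow. This is only an in-process simulation, not Hadoop code: the function names `word_count_job` and `top_count_job` are made up for illustration, and the "single reducer" is modeled by simply picking the largest count key.

```python
from collections import defaultdict


def word_count_job(lines):
    """Job 1: map emits (word, 1); reduce sums the ones per word."""
    grouped = defaultdict(list)
    for line in lines:
        for word in line.split():
            grouped[word].append(1)  # mapper output, grouped by key
    return {word: sum(ones) for word, ones in grouped.items()}


def top_count_job(counts):
    """Job 2: invert to (count, word) and keep the group with the largest key."""
    inverted = defaultdict(list)
    for word, count in counts.items():
        inverted[count].append(word)  # the count is now the key
    if not inverted:
        return None  # empty input: nothing to report
    highest = max(inverted)  # stands in for the sorted, single-reducer pass
    return highest, sorted(inverted[highest])


counts = word_count_job(["i love love love you and love love you"])
result = top_count_job(counts)
print(result)  # (5, ['love'])
```

Because the count is the key in the second pass, words that tie for the highest count end up in the same group and are all reported together.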
Created 11-08-2014 06:09 AM
It seems you still don't know the main concept of map and reduce. Basically, your question is very easy.
As you may know, the output from map is automatically sorted by word, so the original WordCount output is, of course, not what you want.
For example, say you have the original words below:
"i love love love you and love love you"
Then the output after map will look like this:
and 1
i 1
love 5
you 2
So once these words arrive as input in Reduce, you just keep track of the numbers and compare them; once you have the max, output it.
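A rough sketch of this "remember the max, output at the end" idea, again simulated in plain Python rather than written against the Hadoop API. The class `MaxWordReducer` and the helpers `map_phase` and `run` are hypothetical names; the `cleanup` method is meant to mirror the hook a Hadoop reducer gets after its last key. The approach assumes a single reducer, since otherwise each reducer would only know its own local maximum.

```python
from itertools import groupby


def map_phase(lines):
    """Emit (word, 1) for every word, as in the standard word count."""
    for line in lines:
        for word in line.split():
            yield word, 1


class MaxWordReducer:
    """Keeps only the best (word, count) seen so far instead of writing every word."""

    def __init__(self):
        self.best_word = None
        self.best_count = 0

    def reduce(self, word, values):
        total = sum(values)
        if total > self.best_count:  # strict '>' keeps the first word on a tie
            self.best_word, self.best_count = word, total

    def cleanup(self):
        """Called once after the last key; this is the only output row."""
        return self.best_word, self.best_count


def run(lines):
    pairs = sorted(map_phase(lines))  # simulated shuffle: sort by word
    reducer = MaxWordReducer()
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        reducer.reduce(word, (count for _, count in group))
    return reducer.cleanup()


top = run(["i love love love you and love love you"])
print(top)  # ('love', 5)
```

In this sketch the per-word totals are computed inside `reduce`, and nothing is emitted until every key has been seen.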
Created 02-22-2021 06:46 AM
That output comes after the reduce function, not map.
