Member since: 07-21-2015
Posts: 25
Kudos Received: 0
Solutions: 1
09-20-2015
06:55 AM
Hi Harish, thanks for your reply. I have another question: how can we determine the number of mappers in the wordcount program mentioned above? Can we determine it using only the two input files, a.txt and b.txt? Or is it mandatory to know the file size and block size? Please help.
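For readers following this thread: with FileInputFormat-based formats such as TextInputFormat, the mapper count equals the number of input splits, and splits are computed per file from the file length and the split size (by default the block size). So the file names alone are not enough, although two small files always give at least one split each. The following is a minimal plain-Java sketch of that arithmetic, not Hadoop's own code. The class and method names, the file lengths, and the 128 MB block size are assumptions for illustration; the 1.1 slack factor mirrors the SPLIT_SLOP constant in FileInputFormat, and non-empty, splittable files are assumed.

```java
public class SplitEstimate {
    // Slack factor: a trailing chunk up to 10% larger than the split size
    // is kept as a single split instead of being cut in two.
    static final double SPLIT_SLOP = 1.1;

    // Estimated number of splits (and therefore mappers) for one non-empty,
    // splittable file. Both arguments are in bytes.
    static int splitsForFile(long fileLength, long splitSize) {
        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining > 0) {
            splits++; // the final, possibly short, split
        }
        return splits;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;   // assumed block size
        long[] fileLengths = {32, 32};         // hypothetical sizes of a.txt and b.txt
        int mappers = 0;
        for (long length : fileLengths) {
            mappers += splitsForFile(length, blockSize);
        }
        System.out.println("Estimated mappers: " + mappers); // prints "Estimated mappers: 2"
    }
}
```

Under these assumptions, each tiny file fits in a single split, so the job would run two map tasks. Larger files add one mapper per additional split.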
09-20-2015
12:43 AM
Hi, Normally, for the input I mentioned, we should get output like f 4 g 2 h 6 ... ... r 4. But I need only the last key and its sum as output, i.e. 'r' and its sum, 4. How can we achieve this? Is there any way to get only the last key and its count as output?
09-12-2015
11:29 PM
Hi,

I have been trying to write a word count program that emits only one key-value pair, i.e. the last key-value pair in the input file, using the wordcount MapReduce program. Here is the content of the input files in a directory:

a.txt :
====
f f g h i i j k l l m r f f h h

Content of b.txt
========
r r g h h h m m c c b b d d r f

O/p should be :
r 4

Here is my sample mapper code and reducer code for a simple word count. Can anyone tell me what changes I should make to get the output like above?

Mapper code:
--------------------
public class WcMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    // Emits (token, 1) for every whitespace-separated token in the line.
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer st = new StringTokenizer(value.toString());
        while (st.hasMoreTokens()) {
            word.set(st.nextToken());
            context.write(word, one);
        }
    }
}

Reducer code :
---------------------
public class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Sums the counts for one key and emits (key, sum).
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

Driver code:
-----------------------
public class WcDriver extends Configured implements Tool {
    public static void main(String[] args) throws Exception {
        int status = ToolRunner.run(new WcDriver(), args);
        System.exit(status);
    }

    @Override
    public int run(String[] args) throws Exception {
        Configuration c1 = new Configuration();
        Job j1 = new Job(c1, "woc");
        j1.setJarByClass(WcDriver.class);
        j1.setMapperClass(WcMapper.class);
        j1.setReducerClass(WcReducer.class);
        j1.setInputFormatClass(TextInputFormat.class);
        j1.setOutputFormatClass(TextOutputFormat.class);
        j1.setOutputKeyClass(Text.class);
        j1.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(j1, new Path(args[0]));
        FileOutputFormat.setOutputPath(j1, new Path(args[1]));
        // Delete the output directory if it already exists.
        FileSystem fs = FileSystem.newInstance(c1);
        if (fs.exists(new Path(args[1]))) {
            fs.delete(new Path(args[1]), true);
        }
        return j1.waitForCompletion(true) ? 0 : 1;
    }
}

I appreciate all help. Please help.
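One common way to get only the last key, offered here as a sketch rather than the only approach: run the job with a single reduce task (j1.setNumReduceTasks(1)), so that one reducer sees every key in sorted order; have reduce() remember the most recent key and its sum in fields instead of writing them; and write that remembered pair once from the reducer's cleanup(Context) method. The plain-Java program below only simulates that logic so it can run without a cluster. The class and method names are hypothetical, and a TreeMap stands in for the shuffle's sorted, grouped keys.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LastKeyCount {
    // Counts tokens across all inputs and returns the entry with the largest key,
    // i.e. the pair a single reducer would still be holding when cleanup() runs.
    // Returns null when there are no tokens at all.
    static Map.Entry<String, Integer> lastKeyCount(String... inputs) {
        TreeMap<String, Integer> counts = new TreeMap<>(); // keys kept in sorted order
        for (String text : inputs) {
            StringTokenizer st = new StringTokenizer(text);
            while (st.hasMoreTokens()) {
                counts.merge(st.nextToken(), 1, Integer::sum); // same as summing the 1s
            }
        }
        return counts.lastEntry();
    }

    public static void main(String[] args) {
        String a = "f f g h i i j k l l m r f f h h";   // content of a.txt from the post
        String b = "r r g h h h m m c c b b d d r f";   // content of b.txt from the post
        Map.Entry<String, Integer> last = lastKeyCount(a, b);
        System.out.println(last.getKey() + " " + last.getValue()); // prints "r 4"
    }
}
```

With more than one reduce task, each reducer would emit its own last key, which is why the single-reducer setting matters for this approach.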
Labels: MapReduce
08-18-2015
01:01 AM
Thanks Harish
08-11-2015
05:32 AM
Sorry Harish, can you please explain in detail with a small example?
08-11-2015
04:31 AM
"For example TextInputFormat will read the last line of the FileSplit past the split boundary and, when reading other than the first FileSplit, TextInputFormat ignores the content up to the first newline." What does this mean?
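The quoted sentence about TextInputFormat can be illustrated with a small simulation. It describes two rules: a reader keeps going past the end of its split until it finishes the line it started, and a reader for any split other than the first throws away everything up to the first newline, because the previous reader already consumed that line. The sketch below is a simplified model, not Hadoop's LineRecordReader: the class and method names are hypothetical, it works on characters rather than bytes, and it only handles "\n" line endings.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitLines {
    // Returns the lines that the reader for the split [start, end) would produce.
    static List<String> readSplit(String data, int start, int end) {
        int pos = start;
        if (start != 0) {
            // Not the first split: skip the content up to and including the first newline.
            int nl = data.indexOf('\n', start);
            pos = (nl < 0) ? data.length() : nl + 1;
        }
        List<String> lines = new ArrayList<>();
        // A line is read whenever it STARTS at or before the split end,
        // even if it finishes beyond the boundary.
        while (pos <= end && pos < data.length()) {
            int nl = data.indexOf('\n', pos);
            int lineEnd = (nl < 0) ? data.length() : nl;
            lines.add(data.substring(pos, lineEnd));
            pos = lineEnd + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        String data = "alpha\nbravo\ncharlie\n"; // 20 characters; "bravo" straddles offset 8
        System.out.println(readSplit(data, 0, 8));   // prints "[alpha, bravo]"
        System.out.println(readSplit(data, 8, 20));  // prints "[charlie]"
    }
}
```

Here the split boundary at offset 8 falls in the middle of "bravo". The first reader finishes that line; the second reader discards the leftover "avo" and starts at "charlie". Every line is read exactly once, by the reader of the split in which the line begins.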
08-11-2015
04:06 AM
Hi Harish, Thanks for the reply. I had already gone through the same link, but I couldn't understand it. Can you please explain it to me?
08-10-2015
11:34 PM
Hi, I have gone through this question. Can anyone please tell me the correct answer, with an explanation?

Which best describes how TextInputFormat processes input files and line breaks?

A. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader of the split that contains the beginning of the broken line.
B. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReaders of both splits containing the broken line.
C. The input file is split exactly at the line breaks, so each RecordReader will read a series of complete lines.
D. Input file splits may cross line breaks. A line that crosses file splits is ignored.
E. Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader of the split that contains the end of the broken line.

Thanks in advance.
07-24-2015
12:39 AM
Thanks Harish. I have seen the same on the site. It says the option is used to create a Hive table, and that the job will fail if the table already exists. Right? So my doubt is: if we don't use --create-hive-table in the first command, it should still create a Hive table using "--hive-import", right? So what is the significance of the --create-hive-table option? Please correct me if I am wrong. Thanks for your help, Harish.