Member since
03-16-2016
7
Posts
8
Kudos Received
0
Solutions
03-21-2016
12:35 AM
1 Kudo
Thanks for your response, @Rich Raposa. I eventually took a simpler approach to finish the exam (although 15 minutes late 🙂). I am wondering why using MultipleInputs gives a TaggedInputSplit object and not a FileSplit.
03-21-2016
12:16 AM
1 Kudo
There is a header in one of the CSV files, and I was trying to ignore the first record. Below is my sample code.

setup() {
    Path path = ((FileSplit) split).getPath();
    filename = path.getName();
    if (filename.equals("flightdata1.csv")) {
        hasheader = true;
    }
}

map(....) {
    if (hasheader) {
        // With TextInputFormat the key is the byte offset, so 0 is the first line of the file
        if (key.get() == 0) return;
    }
    ......
    ......
}
03-20-2016
11:35 PM
1 Kudo
In the practice exam I got a ClassCastException. It looks like TaggedInputSplit is not even a public class. How do I get the filename when using MultipleInputs?

Mapper setup method:
Path path = ((FileSplit) split).getPath();

Driver class:
MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, FlightDataMapper.class);
MultipleInputs.addInputPath(job, new Path(args[2]), TextInputFormat.class, WeatherDataMapper.class);

Exception:
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit cannot be cast to org.apache.hadoop.mapreduce.lib.input.FileSplit

Any help is greatly appreciated.

Thanks,
Sanjay
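One workaround that is often suggested for this exception is to unwrap the split through reflection: MultipleInputs wraps every split in a TaggedInputSplit (which records the input format and mapper to use), and that wrapper exposes the original split through a getInputSplit() method. The sketch below shows only the unwrapping technique. The nested InputSplit, FileSplit, and TaggedInputSplit classes are simplified stand-ins written for this example so that it runs without Hadoop on the classpath, and SplitUnwrapper and its methods are hypothetical names. In a real mapper the split would come from context.getInputSplit() and getPath() would return a Path.

```java
import java.lang.reflect.Method;

class SplitUnwrapper {
    // Stand-ins for the Hadoop types (assumption: simplified for illustration).
    abstract static class InputSplit {}

    static class FileSplit extends InputSplit {
        private final String path; // the real FileSplit returns a Path, not a String
        FileSplit(String path) { this.path = path; }
        String getPath() { return path; }
    }

    // Mimics the non-public wrapper that MultipleInputs places around each split.
    static class TaggedInputSplit extends InputSplit {
        private final InputSplit inner;
        TaggedInputSplit(InputSplit inner) { this.inner = inner; }
        public InputSplit getInputSplit() { return inner; }
    }

    // Returns the underlying FileSplit, unwrapping a tagged split via reflection.
    static FileSplit toFileSplit(InputSplit split) {
        if (split instanceof FileSplit) {
            return (FileSplit) split;
        }
        try {
            Method getter = split.getClass().getDeclaredMethod("getInputSplit");
            getter.setAccessible(true); // required when the declaring class is not public
            return (FileSplit) getter.invoke(split);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unexpected split type: " + split.getClass(), e);
        }
    }

    // Last path segment, for example "flightdata1.csv".
    static String fileName(InputSplit split) {
        String path = toFileSplit(split).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        InputSplit tagged = new TaggedInputSplit(new FileSplit("/data/flightdata1.csv"));
        System.out.println(fileName(tagged)); // prints "flightdata1.csv"
    }
}
```

Reflection is used because the wrapper class cannot be named from mapper code; the instanceof check keeps the same helper usable for jobs that do not go through MultipleInputs.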
Labels: Apache Hadoop
03-16-2016
03:38 PM
Thanks, @Rich Raposa. It was actually a little confusing to see ONLY setGroupingComparatorClass mentioned in the objective, while secondary sort involves writing comparator classes for sorting and grouping and using both the setSortComparatorClass and setGroupingComparatorClass methods.
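To make the division of labour between the two comparators concrete, here is a small plain-Java simulation, not Hadoop code. It assumes a hypothetical composite key (a station name plus a reading): the sort comparator orders keys by station and then by reading, while the grouping comparator compares the station only, so all readings for one station land in a single reduce-style group, already sorted. The class, field, and method names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SecondarySortSketch {
    // Hypothetical composite key: natural key (station) plus the value to order by.
    static final class CompositeKey {
        final String station;
        final int reading;
        CompositeKey(String station, int reading) {
            this.station = station;
            this.reading = reading;
        }
    }

    // Plays the role of the sort comparator: full ordering on the composite key.
    static final Comparator<CompositeKey> SORT =
        Comparator.comparing((CompositeKey k) -> k.station)
                  .thenComparingInt(k -> k.reading);

    // Plays the role of the grouping comparator: looks at the natural key only.
    static final Comparator<CompositeKey> GROUP =
        Comparator.comparing((CompositeKey k) -> k.station);

    // Sorts the keys, then starts a new group whenever GROUP sees a different key.
    static List<List<Integer>> simulate(List<CompositeKey> keys) {
        List<CompositeKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT);
        List<List<Integer>> groups = new ArrayList<>();
        CompositeKey previous = null;
        for (CompositeKey key : sorted) {
            if (previous == null || GROUP.compare(previous, key) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(key.reading);
            previous = key;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CompositeKey> keys = Arrays.asList(
            new CompositeKey("b", 3), new CompositeKey("a", 2),
            new CompositeKey("b", 1), new CompositeKey("a", 5));
        System.out.println(simulate(keys)); // prints [[2, 5], [1, 3]]
    }
}
```

In this simulation the ordering inside each group comes entirely from the sort comparator; the grouping comparator only decides where one group ends and the next begins.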
03-16-2016
02:29 PM
2 Kudos
@Rich Raposa One of the objectives in the HDPCD:Java exam is to sort the "output" of an MR job using
http://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/mapreduce/Job.html#setGroupingComparatorClass(java.lang.Class)
My understanding is that the grouping comparator is for grouping records from multiple partitions. How can this be used for sorting?
Do you mean using setSortComparatorClass? Thanks for your help!
Labels: Apache Hadoop, MapReduce
03-16-2016
12:25 PM
1 Kudo
Thank you very much @Rich Raposa.
03-16-2016
04:14 AM
2 Kudos
There is an objective in the HDPCD:Java exam to use LocalResource in MapReduce jobs. I haven't come across any example on the web that helps me understand this. I would appreciate it if someone could give me pointers on how LocalResource can be used in MR jobs. How is this different from cache files? http://hortonworks.com/training/class/hdp-certified-java-developer-exam/
Labels: Apache Hadoop