HDP Certified Java Developer practice exam
Labels: Apache Hadoop
Created 03-20-2016 11:35 PM
In the practice exam I got a ClassCastException. It looks like TaggedInputSplit is not even a public class. How do I get the filename when using MultipleInputs?
Mapper setup method:
Path path = ((FileSplit) split).getPath();
Driver class:
MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, FlightDataMapper.class);
MultipleInputs.addInputPath(job, new Path(args[2]), TextInputFormat.class, WeatherDataMapper.class);
Exception:
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit cannot be cast to org.apache.hadoop.mapreduce.lib.input.FileSplit
Any help is greatly appreciated.
Thanks,
Sanjay
Created 03-21-2016 12:03 AM
I'm not sure why you are getting that exception, but I do know that you do not need to get the path of any input files on the exam. What are you trying to do? Showing more of your code might help me to provide more insight.
Created 03-21-2016 12:16 AM
There is a header in one of the CSV files, and I was trying to ignore the first record. Below is my sample code.
@Override
protected void setup(Context context) throws IOException, InterruptedException {
    // Check which file this mapper is reading; the cast below is what fails with MultipleInputs
    InputSplit split = context.getInputSplit();
    Path path = ((FileSplit) split).getPath();
    filename = path.getName();
    if (filename.equals("flightdata1.csv")) {
        hasheader = true;
    }
}

@Override
protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    // Skip the first record of the file that has a header row
    if (hasheader) {
        if (key.get() == 0) return;
    }
    ......
    ......
}
Created 03-21-2016 12:21 AM
An easier solution would be to read every row and check whether it is a header row by testing one of the columns for a value that only appears in the header. A minimal sketch of that approach is below.
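For illustration, here is one way that check could look inside map(). This is only a sketch: "Year" is a hypothetical column label standing in for whatever value appears only in your header row.

@Override
protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    String[] fields = value.toString().split(",");
    // "Year" stands in for a column label that only occurs in the header row
    if (fields.length > 0 && fields[0].equals("Year")) {
        return;  // skip the header row
    }
    // ... normal record processing ...
}

This avoids touching the input split entirely, so it works the same whether the mapper was configured through MultipleInputs or not.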
Created 03-21-2016 12:35 AM
Thanks for your response @Rich Raposa
I eventually took a simpler approach to finish the exam (although 15 mins late 🙂 )
I am wondering why using MultipleInputs gives a TaggedInputSplit object and not a FileSplit.
Created 03-21-2016 12:40 AM
There's a discussion here that answers your question.
http://stackoverflow.com/questions/11130145/hadoop-multipleinputs-fails-with-classcastexception
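In short, MultipleInputs wraps each split in a package-private TaggedInputSplit so it can track the per-path mapper and input format, which is why the direct cast to FileSplit fails. A rough sketch of the reflection workaround discussed in that thread, for use inside setup(), looks like this (untested here; the class name is taken from the exception above):

// Requires: import java.lang.reflect.Method;
InputSplit split = context.getInputSplit();
if (split.getClass().getName().equals(
        "org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit")) {
    try {
        // TaggedInputSplit is not public, so unwrap it via reflection
        Method getInputSplit = split.getClass().getDeclaredMethod("getInputSplit");
        getInputSplit.setAccessible(true);
        split = (InputSplit) getInputSplit.invoke(split);
    } catch (ReflectiveOperationException e) {
        throw new IOException("Could not unwrap TaggedInputSplit", e);
    }
}
Path path = ((FileSplit) split).getPath();
String filename = path.getName();

Relying on reflection against a private class is brittle across Hadoop versions, so the header-value check suggested earlier is generally the safer option.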
