HDP Certified Java Developer practice exam

Explorer

In the practice exam I got a ClassCastException. It looks like TaggedInputSplit is not even a public class. How can I get the filename when using MultipleInputs?

Mapper setup method:

Path path = ((FileSplit) split).getPath();

Driver class:

MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, FlightDataMapper.class);
MultipleInputs.addInputPath(job, new Path(args[2]), TextInputFormat.class, WeatherDataMapper.class);

Exception:

java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit cannot be cast to org.apache.hadoop.mapreduce.lib.input.FileSplit

Any help is greatly appreciated.

Thanks,

Sanjay

1 ACCEPTED SOLUTION

Guru

An easier solution would be to read every row, and then add a check to see if it's a header row by checking one of the columns that you know only appears in a header row.

Guru

I'm not sure why you are getting that exception, but I do know that you do not need to get the path of any input files on the exam. What are you trying to do? Showing more of your code might help me to provide more insight.

Explorer

There is a header in one of the CSV files, and I was trying to skip that first record. Below is my sample code.

setup() {
    Path path = ((FileSplit) split).getPath();
    filename = path.getName();
    if (filename.equals("flightdata1.csv")) {
        hasheader = true;
    }
}

map(....) {
    if (hasheader) {
        // with TextInputFormat the key is the byte offset, so 0 is the first line of the file
        if (key.get() == 0) return;
    }
    ......
    ......
}

Guru

An easier solution would be to read every row, and then add a check to see if it's a header row by checking one of the columns that you know only appears in a header row.
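A minimal sketch of that header check, in plain Java so it runs without Hadoop on the classpath. The class name HeaderRowCheck and the header label "Year" are assumptions made for illustration; substitute a column name that actually appears only in your file's header row.

```java
// Sketch only: detects a CSV header row by looking at its first column.
final class HeaderRowCheck {

    // Hypothetical label. Replace it with a value that appears only in the header row.
    static final String HEADER_FIRST_COLUMN = "Year";

    // Returns true when the first comma-separated field matches the header label.
    static boolean isHeaderRow(String line) {
        if (line == null || line.isEmpty()) {
            return false;
        }
        // A limit of 2 avoids splitting the whole row when only the first field matters.
        String firstColumn = line.split(",", 2)[0].trim();
        return HEADER_FIRST_COLUMN.equalsIgnoreCase(firstColumn);
    }
}
```

Inside map() this would be used as something like "if (HeaderRowCheck.isHeaderRow(value.toString())) return;", which removes the need to know which file or split the record came from.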

Explorer

Thanks for your response @Rich Raposa

I eventually took a simpler approach to finish the exam (although 15 mins late 🙂 )

I am still wondering why using MultipleInputs gives a TaggedInputSplit object and not a FileSplit.
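For what it is worth, the likely explanation is that MultipleInputs routes the job through DelegatingInputFormat, which wraps each real split in a TaggedInputSplit so the framework can remember which InputFormat and Mapper belong to it. That wrapper class is package-private, so it cannot be named in a cast, but it does expose a getInputSplit() method. A common workaround is to call that method by reflection. The sketch below shows the idea in plain Java; SplitUnwrapper and StandInTaggedSplit are hypothetical names, and the stand-in class exists only so the example runs without Hadoop.

```java
import java.lang.reflect.Method;

// Stand-in for Hadoop's non-public TaggedInputSplit (illustration only).
final class StandInTaggedSplit {
    private final Object inner;

    StandInTaggedSplit(Object inner) {
        this.inner = inner;
    }

    public Object getInputSplit() {
        return inner;
    }
}

// Sketch only: unwraps a split that hides the real one behind getInputSplit().
final class SplitUnwrapper {

    // Returns the wrapped split, or the argument itself when there is nothing to unwrap.
    static Object unwrap(Object split) {
        try {
            Method getter = split.getClass().getDeclaredMethod("getInputSplit");
            // Needed because the declaring class is not public.
            getter.setAccessible(true);
            return getter.invoke(split);
        } catch (NoSuchMethodException e) {
            // No such method: assume this is already the underlying split.
            return split;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not unwrap input split", e);
        }
    }
}
```

In a mapper's setup(), the assumption is that the result of SplitUnwrapper.unwrap(context.getInputSplit()) could then be cast to FileSplit to read the path. This depends on Hadoop internals that may change between versions, so the header-row check is the safer choice for the exam.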
