Registered: ‎10-16-2017

Missing class in cdh versions of avro-mapred.jar


The class org.apache.avro.mapreduce.AvroRecordReaderBase references org.apache.avro.hadoop.io.AvroSerialization (not to be confused with the class of the same name in the mapred package). The jar avro-mapred-1.7.6.jar contains that AvroSerialization class, but avro-mapred-1.7.6-cdh5.12.1.jar does not. avro-tools-1.7.6-cdh5.12.1.jar does contain the class I need, but it also bundles a lot of unrelated packages (e.g. amazonaws) that cause conflicts. The same is true of the cdh5.10.1 version.

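For anyone who wants to repeat the comparison, here is a minimal sketch of how a jar can be checked for a class entry. The class name JarClassCheck and the helper containsClass are just names I made up for illustration; it relies only on the standard java.util.jar API and looks for the .class entry by path, without loading anything.

```java
import java.io.IOException;
import java.util.jar.JarFile;

class JarClassCheck {
    /** Returns true if the jar has a .class entry for the given fully qualified class name. */
    static boolean containsClass(String jarPath, String className) throws IOException {
        // A class a.b.C is stored in a jar as the entry a/b/C.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        String className = "org.apache.avro.hadoop.io.AvroSerialization";
        // Each argument is a path to a jar file to inspect
        for (String jarPath : args) {
            String result = containsClass(jarPath, className) ? "present" : "missing";
            System.out.println(jarPath + ": " + result);
        }
    }
}
```

Run it with the jar paths as arguments, e.g. java JarClassCheck avro-mapred-1.7.6.jar avro-mapred-1.7.6-cdh5.12.1.jar, to see which of them carry the class.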
When I try to use avro-mapred-1.7.6.jar (the one without the cdh5 suffix) instead, I run into a different error at run time:

java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected

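As far as I understand it (I have not confirmed this for these specific jars), this error typically shows up when code compiled against Hadoop 1, where TaskAttemptContext is a class, runs against Hadoop 2, where it is an interface. A rough sketch for checking what is actually on the classpath follows; ClassOriginProbe and describe are hypothetical names of my own, and the sketch uses only standard JDK reflection calls.

```java
import java.security.CodeSource;

class ClassOriginProbe {
    /** Reports whether the named type is a class or an interface and where it was loaded from. */
    static String describe(String className) throws ClassNotFoundException {
        // Load without running static initializers; we only want metadata
        Class<?> c = Class.forName(className, false, ClassOriginProbe.class.getClassLoader());
        String kind = c.isInterface() ? "interface" : "class";
        // CodeSource is null for classes loaded by the bootstrap loader
        CodeSource src = c.getProtectionDomain().getCodeSource();
        String origin = (src == null || src.getLocation() == null)
                ? "(bootstrap or unknown)"
                : src.getLocation().toString();
        return kind + " " + className + " loaded from " + origin;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to the type named in the error; other names can be passed as arguments
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.hadoop.mapreduce.TaskAttemptContext"};
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```

Running this with the same classpath as the job shows which jar supplies TaskAttemptContext and whether it is the interface or the class variant.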
I am testing this on a minicluster, using the cdh5.12.1 versions of the other Hadoop jars.

What should I do to resolve this reference? I'd like to avoid building avro-mapred.jar from source myself.

Update: I see now that the AvroSerialization class is in the jar from the Cloudera tarball, avro-1.7.6-cdh5.12.1/dist/java/avro-mapred-1.7.6-cdh5.12.1-hadoop2.jar, but it is not in the jar I downloaded from https://mvnrepository.com/artifact/org.apache.avro/avro-mapred/1.7.6-cdh5.12.1
