Class not found error while reading ORC file in MapReduce

Super Collaborator

Hi guys,

I have an external table in Hive (2.1.1) whose data is stored as an ORC file. I want to read this ORC file from a Mapper class using the OrcInputFormat class. I have added the following ORC dependencies in Maven, along with the other jars (Hadoop and Hive) required to run the MapReduce application. The Hadoop version is 2.7.3.

<dependency>
  <groupId>org.apache.orc</groupId>
  <artifactId>orc-mapreduce</artifactId>
  <version>1.2.3</version>
</dependency>
<dependency>
  <groupId>org.apache.orc</groupId>
  <artifactId>orc-core</artifactId>
  <version>1.2.3</version>
</dependency>

While running the job, I am getting this error:

FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoClassDefFoundError: org/apache/hadoop/hive/common/io/DiskRange
    at org.apache.orc.OrcFile.createReader(OrcFile.java:227)

I searched the Hive Javadocs and found that this class should be in hive-common-2.1.1.jar. After extracting that jar, however, I found that the class is not there, although the API docs show it as a concrete class. Please help. Thanks a lot.
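Rather than extracting each jar by hand, the same check can be scripted. The following is a minimal sketch, not part of the original post: `JarClassFinder` and its method names are hypothetical, and it uses only the standard `java.util.jar.JarFile` API to report whether a given class is present in each candidate jar.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper: reports which of the given jars contain a class.
public class JarClassFinder {

    /** Converts a fully qualified class name to its entry path inside a jar. */
    public static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has an entry for the given class. */
    public static boolean containsClass(Path jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.getEntry(toEntryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: JarClassFinder <class.name> <jar> [<jar> ...]");
            System.exit(1);
        }
        String className = args[0];
        // Check every jar passed on the command line and print one line per jar.
        for (int i = 1; i < args.length; i++) {
            Path jar = Paths.get(args[i]);
            String status = containsClass(jar, className) ? "found" : "missing";
            System.out.println(jar + ": " + status);
        }
    }
}
```

It could be run as, for example, `java JarClassFinder org.apache.hadoop.hive.common.io.DiskRange hive-common-2.1.1.jar` followed by any other candidate jars on the classpath.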


Re: Class not found error while reading ORC file in MapReduce

Super Collaborator

Solved. I was referencing the wrong jar; the correct one is hive-exec-2.1.1.jar. With it, the output of the MapReduce job is correct.
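In a Maven build like the one in the question, that jar would typically be pulled in with a dependency along these lines. This is a sketch that assumes the usual `org.apache.hive` coordinates for hive-exec; the reply itself only names the jar file, so verify the entry against your own repository and cluster setup.

```xml
<!-- Assumed coordinates for hive-exec-2.1.1.jar; not stated in the original reply -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>2.1.1</version>
</dependency>
```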
