ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
Created on ‎04-17-2014 03:14 AM - edited ‎09-16-2022 01:57 AM
I have a Parquet-based table and can successfully select from it in Hive and Impala,
but when I try to select from that table in Shark, I receive the error:
14/04/17 11:33:49 INFO parse.ParseDriver: Parse Completed
14/04/17 11:33:49 INFO parse.SharkSemanticAnalyzer: Get metadata for source tables
FAILED: Hive Internal Error: java.lang.RuntimeException(java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat)
14/04/17 11:33:50 ERROR shark.SharkDriver: FAILED: Hive Internal Error: java.lang.RuntimeException(java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat)
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
at org.apache.hadoop.hive.ql.metadata.Table.getInputFormatClass(Table.java:306)
at org.apache.hadoop.hive.ql.metadata.Table.<init>(Table.java:99)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:988)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:891)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1083)
Which artifact contains this class, and what do I need to install, link, or configure to get rid of the error?
I am using CDH 5; the Parquet libraries are in /opt/cloudera/parcels/CDH/lib/parquet.
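One way to narrow this down is to scan a lib directory and see which jar (if any) actually bundles the missing class. A minimal sketch, assuming only that the directory contains ordinary jar files (the CDH path below is the one from the question):

```python
import zipfile
from pathlib import Path

def find_class_in_jars(jar_dir, class_name):
    """Return the names of jars under jar_dir that bundle the given class.

    class_name is the dotted Java name, e.g.
    'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'.
    """
    # A Java class com.foo.Bar lives in a jar as the entry com/foo/Bar.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            if entry in zf.namelist():
                hits.append(jar.name)
    return hits

# e.g., against the parcel directory mentioned above:
# find_class_in_jars("/opt/cloudera/parcels/CDH/lib/parquet",
#                    "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat")
```

If the scan comes back empty for every directory on Shark's classpath, the fix is to add a jar that does contain the class.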
thanks in advance, Gerd
Created ‎04-17-2014 06:46 AM
Hi,
the earlier error when accessing a Parquet-based table via Shark ("java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat") has been resolved by adding
parquet-hive-bundle-1.4.1.jar to Shark's lib folder.
Now the Hive metastore, including the Parquet-based table, can be read successfully.
But when I select from that table, I receive:
org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed 4 times (most recent failure: Exception failure: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
This is really strange, since the class org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe is also included in parquet-hive-bundle-1.4.1.jar.
I copied that .jar to both lib folders: Shark's (/opt/shark/shark-0.9.1/lib) and Spark's (under /opt/cloudera/parcels...).
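When the same ClassNotFoundException shows up only inside a task, a common cause is that the jar is visible to the driver but not on every classpath the worker JVMs use. A small sketch to report, per lib directory, whether any jar there bundles the class (the Shark path is from this post; the second path is a placeholder, not the real parcel layout):

```python
import zipfile
from pathlib import Path

def class_visible_in(lib_dirs, class_name):
    """Report, per lib directory, whether any jar there bundles class_name.

    Helpful when a ClassNotFoundException appears only on executors: the
    jar must be present on every JVM's classpath, not just the driver's.
    """
    entry = class_name.replace(".", "/") + ".class"
    report = {}
    for d in lib_dirs:
        found = False
        for jar in Path(d).glob("*.jar"):
            try:
                with zipfile.ZipFile(jar) as zf:
                    found = entry in zf.namelist()
            except zipfile.BadZipFile:
                continue  # skip unreadable files rather than abort the scan
            if found:
                break
        report[str(d)] = found
    return report

# e.g. with the two lib folders from this post (second path is a placeholder):
# class_visible_in(["/opt/shark/shark-0.9.1/lib", "<spark lib dir under /opt/cloudera/parcels/...>"],
#                  "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe")
```

A directory that reports False is a classpath the bundle never reached; note that even a True result only means the jar is on disk there, not that the running JVMs were restarted to pick it up.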
...getting more and more confused 😉
Any help?
regards, Gerd
