ERROR FAILED: SemanticException Class not found: com.cloudera.impala.hive.serde.ParquetInputFormat

Expert Contributor

I am receiving the following error while inserting data into a Parquet table:

FAILED: SemanticException Class not found: com.cloudera.impala.hive.serde.ParquetInputFormat

Please help me resolve this issue. I am also not able to find the jar that should contain this class.
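For context, the failure surfaces on an ordinary Hive insert. A minimal sketch with hypothetical table names (as a later reply reveals, the SELECT reading from another Parquet table is what trips the error):

INSERT INTO TABLE target_parquet_tbl PARTITION (dt='2015-07-01')
SELECT id, val FROM source_parquet_tbl;
-- FAILED: SemanticException Class not found: com.cloudera.impala.hive.serde.ParquetInputFormat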


10 REPLIES

Expert Contributor

I am using CDH 5.4.2.

New Contributor

Hi,

Does your table have partitions? What is the flow of creation of the tables? Could you give some more details? We are facing the same issue in two different flows. There is a reference to this bug having been fixed in version 4.7.x.

 

We do not have this error in CDH 5.2.x, but we have been seeing it at least since 5.4.0.

 

If you can share some more information, we could combine these issues to request Cloudera's attention.

 

-Sreesankar

 

Expert Contributor

Hi,

Yes, we have partitions on the tables. It looks like the error appears after running INVALIDATE METADATA in Impala followed by COMPUTE STATS: the tables are partitioned and stored in Parquet format, and once COMPUTE STATS has been run from Impala we can no longer access them from Hive.

 

In our previous CDH version, 5.3.2, this worked fine: tables were accessible from both Hive and Impala even after running COMPUTE STATS. In the latest version, CDH 5.4.2, it looks like a bug. If Cloudera can help us it will be a great plus to stay on CDH 5.4.2; otherwise we will have to consider other options.
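To make the trigger concrete, this is the kind of sequence being described, with a hypothetical table name (run from impala-shell against a partitioned Parquet table):

INVALIDATE METADATA my_parquet_tbl;
COMPUTE STATS my_parquet_tbl;
-- After this, any Hive query that reads the table fails with:
-- FAILED: SemanticException Class not found: com.cloudera.impala.hive.serde.ParquetInputFormat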

New Contributor

Hi,

 

I am having the same issue on CDH 5.4.2. The error appears even without running INVALIDATE METADATA or COMPUTE STATS from Impala. I have dropped and recreated the table, even with a different name, just to be sure there wasn't some residual metadata causing this. I am using partitions too.

 

I ended up having to create the table and partitions and insert the data using Impala. Surprisingly, Hive cannot select data from that table either: same error.
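One way to confirm it is the table metadata that is broken (rather than the data files) is to print the storage descriptor from the metastore; this only reads metadata, so it should still work even when SELECT fails. The table name is hypothetical:

hive> DESCRIBE FORMATTED my_parquet_tbl;
-- In the affected case, the InputFormat line shows
-- com.cloudera.impala.hive.serde.ParquetInputFormat
-- instead of the usual Hive Parquet input format class.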

ACCEPTED SOLUTION

It looks like you are running into this issue:

https://issues.cloudera.org/browse/IMPALA-2048

I'd suggest you give the workaround a try. We've identified the issue and fixed it.
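Judging from the replies below, the workaround amounts to creating a fresh external Hive table over the same HDFS data and re-registering the partitions. A hedged HiveQL sketch; the table name is borrowed from the session transcript below, while the schema, partition column, and location are hypothetical:

-- New external table pointing at the existing data files
CREATE EXTERNAL TABLE tbl_ptr (id BIGINT, val STRING)
PARTITIONED BY (dt STRING)
STORED AS PARQUET
LOCATION '/user/hive/warehouse/broken_tbl';

-- Partitions must be re-created in the new table or queries return no rows:
ALTER TABLE tbl_ptr ADD PARTITION (dt='2015-07-01');
-- or, to discover all partition directories at once:
MSCK REPAIR TABLE tbl_ptr;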

New Contributor

Okay, first of all, my problem is exactly the same. What had somehow escaped me is that the INSERT I was running was actually inserting records selected from another Parquet table. The error, of course, came from Hive being unable to read the source Parquet table.

 

I tried the workaround and can confirm it is working. It took me a while to realize that you have to recreate the partitions in the new table, otherwise you get no output. It does introduce some warnings, though, as below:

 

hive> select * from tbl_ptr limit 1;
OK
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
<RECORD OUTPUT OKAY>
Time taken: 0.387 seconds, Fetched: 1 row(s)
hive> quit;
Jul 3, 2015 10:36:20 AM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
Jul 3, 2015 10:36:20 AM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 17636531 records.
Jul 3, 2015 10:36:20 AM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
Jul 3, 2015 10:36:21 AM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 755 ms. row count = 17636531

Expert Contributor

The workaround works, but it would be good if we could use the same tables and data from both Impala and Hive, as we did in previous versions. That would save us from creating multiple similar tables, one for Hive and another for Impala.

If it is fixed, that's great. Do we now need to reinstall CDH 5.4.2 to resolve this issue?


I agree completely that this is a critical issue, and I appreciate your patience in this matter.

 

The fix will be shipped as part of CDH 5.4.4, tentatively scheduled for the beginning of August.

 

Explorer

I am still seeing this issue with CDH 5.4.4-1.cdh5.4.4.p0.4.

Is this still an issue, or should it be resolved in my version of CDH?

 

Thanks,

Tom