Can't create hive table using Zeppelin 0.7.2 sql / spark. Class not found

Hi, I'm trying to create a Hive table and query it using %sql.

I tried both %spark and %sql; neither works.

Here is my DDL:

%sql
create external table MY_TABLE row format serde 'com.my.MyAvroSerde' 
with serdeproperties ('serialization.class'='com.my.ContainerProto') 
stored as inputformat 'com.my.ProtoAvroFileFormat'  
LOCATION 'hdfs://my/data'
Thrown exception:
MetaException(message:org.apache.hadoop.hive.serde2.SerDeException java.lang.ClassNotFoundException: Class com.my.ContainerProto not found)
This is confusing, because a %spark paragraph that uses the same class works fine:
%spark
import com.my.ContainerProto 
// bla-bla 
val rdd = sc.newAPIHadoopFile[AvroKey[ByteBuffer], NullWritable, 
AvroKeyInputFormat[ByteBuffer]]("hdfs://my/data")

rdd.map{bytes => ContainerProto.fromBytes(bytes)} 
The code runs and produces a result. Why can't a %sql or %spark paragraph see my third-party jars when I try to create the Hive table? The Spark interpreter is configured with the required third-party jars.
1 REPLY

@Sergey Sheypak

I think the issue is in this line: with serdeproperties ('serialization.class'='com.my.ContainerProto')

You are creating a table that references external classes (the SerDe and its serialization.class), and that is what results in the class-not-found error. The way around this is to add every external class your code uses to the dependencies list of the Spark interpreter.

Follow the steps here to do this: https://zeppelin.apache.org/docs/latest/manual/dependencymanagement.html

Once you have done this, restart the interpreter and try running the query with %spark.sql
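
Before rerunning the DDL, it may help to confirm from a %spark paragraph that the class is actually loadable after the restart. Below is a minimal sketch in plain Scala; the helper name classVisible is made up for illustration and is not part of Zeppelin or Spark. Note that it only checks the class loader of the paragraph's own thread. The Hive code that raised the MetaException may resolve classes through a different loader, so a true result here does not guarantee the DDL will succeed.

```scala
import scala.util.Try

// Hypothetical helper: returns true if `className` can be loaded by `loader`.
// The class is looked up without being initialized (second argument is false).
def classVisible(
    className: String,
    loader: ClassLoader = Thread.currentThread().getContextClassLoader
): Boolean =
  Try(Class.forName(className, false, loader)).isSuccess

// Sanity check with a class that always exists on the JVM.
println(classVisible("java.lang.String"))

// The class named in the serdeproperties of the DDL above.
println(classVisible("com.my.ContainerProto"))
```

If the second call prints false, the jar is not on the interpreter's classpath at all, and the dependency settings are the first thing to recheck.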

Hope this helps!

For more background, read through these:

https://mail-archives.apache.org/mod_mbox/incubator-zeppelin-users/201601.mbox/%3CCACcq8R74eTEhKu_j7...

https://issues.apache.org/jira/browse/ZEPPELIN-648 (This is resolved now)

https://issues.apache.org/jira/browse/ZEPPELIN-381 (This is resolved now too)