Serialization Error with Zeppelin Tech Preview 0.6.0 and Spark 1.4.1

New Contributor

My cluster is running Hortonworks HDP 2.3.2.0-2950.

I followed the instructions here to set up Zeppelin with Spark 1.4.1: http://hortonworks.com/hadoop-tutorial/apache-zeppelin/

The first code block below is the code I run; the second is the error I receive. From what I have read online, it looks like the current Zeppelin 0.6.0-SNAPSHOT build is compiled against the wrong version of Jackson. Does anyone know of a workaround for this error?

Thanks!

Kirk

Code:

z.load("/path/to/sqljdbc42.jar")
import org.apache.spark.sql.{DataFrame, SQLContext}

val sqlDbUrl = "jdbc:sqlserver://sqlserverUrl;username=user;password=password;database=db"
val data = sqlContext.load("jdbc", Map("url" -> sqlDbUrl, "dbtable" -> "(select * from data) as data", "driver" -> "com.microsoft.sqlserver.jdbc.SQLServerDriver"))

data.show(20)

Error:

com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
 at [Source: {"id":"3","name":"map"}; line: 1, column: 1]
	at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
	at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:220)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:143)
	at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:409)
	at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:358)
	at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:265)
	at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:245)
	at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:143)
	at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:439)
	at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3666)
	at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3558)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578)
	at org.apache.spark.rdd.RDDOperationScope$.fromJson(RDDOperationScope.scala:82)
	at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1490)
	at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1490)
	at scala.Option.map(Option.scala:145)
	at org.apache.spark.rdd.RDD.<init>(RDD.scala:1490)
	at org.apache.spark.rdd.RDD.<init>(RDD.scala:98)
	at org.apache.spark.rdd.MapPartitionsRDD.<init>(MapPartitionsRDD.scala:24)
	at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:295)
	at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:293)
....

Re: Serialization Error with Zeppelin Tech Preview 0.6.0 and Spark 1.4.1

Mentor

@Kirk Allen, the SQL Server JDBC jar is not available through Maven, so in place of your z.load statement try the following:

%dep

z.load("net.sourceforge.jtds:jtds:1.2.2")

More information is available at the link provided by @azeltov.

Re: Serialization Error with Zeppelin Tech Preview 0.6.0 and Spark 1.4.1

New Contributor

Thanks for the response. The problem isn't that the SQL Server driver fails to load: I can load it correctly with z.load("/path/to/sqljdbc42.jar"). I know this because I was getting a "cannot find suitable driver" exception before I added the z.load command. Also, the same commands work fine with spark-submit, so the issue appears to be related to the com.fasterxml.jackson.databind deserializer.

Re: Serialization Error with Zeppelin Tech Preview 0.6.0 and Spark 1.4.1

Mentor

@Kirk Allen, have you checked your classpath for conflicting libraries, specifically ones under com.fasterxml? You could also try loading that dependency explicitly.
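
One rough way to look for conflicting Jackson jars is to list every Jackson jar in Zeppelin's lib directory and check whether more than one version shows up. This is only a sketch: the function name list_jackson_jars, the ZEPPELIN_LIB variable, and its default path are assumptions, not something confirmed in this thread, so adjust them to your install.

```shell
# Print the file names of all Jackson jars found under a directory.
# Errors from find (e.g. unreadable subdirectories) are suppressed.
list_jackson_jars() {
  find "$1" -type f -name 'jackson-*.jar' -exec basename {} \; 2>/dev/null | sort -u
}

# Assumed location of Zeppelin's bundled libraries; override as needed.
ZEPPELIN_LIB="${ZEPPELIN_LIB:-/usr/hdp/current/zeppelin-server/lib}"

# Only list when the directory actually exists on this machine.
if [ -d "$ZEPPELIN_LIB" ]; then
  list_jackson_jars "$ZEPPELIN_LIB"
fi
```

If the output shows Jackson version numbers that differ from the Jackson version your Spark build uses, that mismatch is a plausible source of the JsonMappingException above.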

Re: Serialization Error with Zeppelin Tech Preview 0.6.0 and Spark 1.4.1

@Kirk Allen, can you try building Zeppelin with the correct Spark version explicitly specified? I just hit a similar error on my custom Zeppelin build, using HDP 2.4 and Spark 1.6.0. This is how I build Zeppelin: mvn package install -Pspark-1.6 -DskipTests

Zeppelin must also be started with the SPARK_HOME environment variable properly set. The best way to do this is by editing conf/zeppelin-env.sh.

You should also copy conf/zeppelin-site.xml.template to conf/zeppelin-site.xml and set the right attributes.
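
For the SPARK_HOME step, the relevant line in conf/zeppelin-env.sh might look like the excerpt below. The path is an assumption (a typical HDP client location), not a value given in this thread; point it at whichever Spark install matches the -Pspark profile you built with.

```shell
# conf/zeppelin-env.sh (excerpt)
# Assumed path: replace with the Spark install that matches your Zeppelin build profile.
export SPARK_HOME="${SPARK_HOME:-/usr/hdp/current/spark-client}"
```

Zeppelin reads this file at startup, so a restart is needed after changing it.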

Re: Serialization Error with Zeppelin Tech Preview 0.6.0 and Spark 1.4.1

Explorer

I was facing the same issue. I solved it as follows (not sure whether this is the correct way):

1. Remove all jars related to Jackson:

rm /usr/hdp/2.5.3.0-37/zeppelin/lib/jack*

2. Restart Zeppelin.
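
A more reversible variant of step 1 is to move the Jackson jars aside instead of deleting them, so they can be restored if Zeppelin fails to start. This is a sketch only: the function name backup_jackson_jars and the backup directory are made up for illustration, and it matches jack*.jar rather than the broader jack* pattern used above.

```shell
# Move Jackson jars out of a lib directory instead of deleting them,
# so the change can be undone by moving them back.
backup_jackson_jars() {
  lib_dir="$1"
  backup_dir="$2"
  mkdir -p "$backup_dir" || return 1
  for jar in "$lib_dir"/jack*.jar; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$jar" ] || continue
    mv "$jar" "$backup_dir"/ && echo "moved $(basename "$jar")"
  done
}

# Example call (the backup path is an assumption; the lib path is from the post above):
# backup_jackson_jars /usr/hdp/2.5.3.0-37/zeppelin/lib /tmp/zeppelin-jackson-backup
```

Restart Zeppelin afterwards, as in step 2. To undo the change, move the jars from the backup directory back into the lib directory.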
