
Spark (Standalone) error local class incompatible: stream classdesc serialVersionUID

Explorer

I'm trying to use Spark (standalone) to load data into Hive tables. The Avro schema loads successfully, and I can see (on the Spark UI page) that my applications have finished running; however, the applications are in the Killed state.

This is the stderr.log shown on the Spark web UI page, via Cloudera Manager:

15/03/25 06:15:58 ERROR Executor: Exception in task 1.3 in stage 2.0 (TID 10)
java.io.InvalidClassException: org.apache.spark.rdd.PairRDDFunctions; local class incompatible: stream classdesc serialVersionUID = 8789839749593513237, local class serialVersionUID = -4145741279224749316
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
15/03/25 06:15:59 ERROR CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@HadoopNode01.local:48707] -> [akka.tcp://sparkDriver@HadoopNode02.local:54550] disassociated! Shutting down.

 

 

Any help will be greatly appreciated.


Thanks

 

 

22 Replies

Explorer
Hi, yes, I basically uninstalled CDH and then reinstalled 5.3.0. I used the latest Cloudera Manager and, rather than doing a parcel install, I selected the package option; from there I could choose to install CDH 5.3.0. I selected the option without Spark, then added the standalone version to the cluster afterwards from the cluster control menu. I also re-downloaded the BDD software, as Oracle updated the BDD install package last week. Even though it still says 1.0, the new software resolved a few other issues.
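
A quick way to confirm that the reinstall left the driver and the executors on the same Spark build (assuming you have shell access to the nodes) is to compare the version banner on the machine you submit from and on a worker node:

spark-submit --version

The reported Spark version, and the Scala version it was built against, should match on every node; the serialVersionUID mismatch in the original post generally comes from those builds differing.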

New Contributor

Hello Dataminer, Riche 78, many thanks for your help. I had been struggling for the last two weeks to fix this issue. I have now also installed CDH 5.0.1 and it is working fine.

New Contributor

Hi! I've got the same error message, and I solved it by using the latest elasticsearch-spark version matching my Scala version:

 

spark-submit --packages org.elasticsearch:elasticsearch-spark-20_2.11:6.2.4 your_script.py
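
The suffix in the artifact name records what the connector was built against: elasticsearch-spark-20_2.11 targets Spark 2.0+ and Scala 2.11, so the _2.11 part has to match the Scala version your Spark distribution uses (the spark-shell startup banner shows it). If your Spark were built with Scala 2.10 instead, you would pull the corresponding artifact, assuming one is published for that connector release:

spark-submit --packages org.elasticsearch:elasticsearch-spark-20_2.10:6.2.4 your_script.py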

 

Hope it helps.