
Kryo exception after Spark upgrade from 1.5.0 to 1.6.0 (Cloudera 5.5.4 to 5.13.0)

Hi guys,

I'm facing an issue with a Spark job that fails with a Kryo exception after the CDH upgrade.

The weird thing is that I can run the job successfully with spark-submit from the shell, but when I run it from Oozie it fails with the following exception.

Any help is much appreciated.

2017-12-26 12:19:51,761 [task-result-getter-0] ERROR org.apache.spark.scheduler.TaskResultGetter  - Exception while getting task result
com.esotericsoftware.kryo.KryoException: Buffer underflow.
	at com.esotericsoftware.kryo.io.Input.require(Input.java:156)
	at com.esotericsoftware.kryo.io.Input.readInt(Input.java:337)
	at com.twitter.chill.java.ArraysAsListSerializer.read(ArraysAsListSerializer.java:60)
	at com.twitter.chill.java.ArraysAsListSerializer.read(ArraysAsListSerializer.java:41)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
	at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:311)
	at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:97)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:60)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1819)
	at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:50)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2017-12-26 12:19:51,767 [dag-scheduler-event-loop] INFO  org.apache.spark.scheduler.cluster.YarnClusterScheduler  - Cancelling stage 1
2017-12-26 12:19:51,770 [dag-scheduler-event-loop] INFO  org.apache.spark.scheduler.cluster.YarnClusterScheduler  - Stage 1 was cancelled
2017-12-26 12:19:51,771 [dag-scheduler-event-loop] INFO  org.apache.spark.scheduler.DAGScheduler  - ResultStage 1 (saveAsHadoopFile at DataaccessLE.scala:284) faile