
Kryo serialization failed


Team,

I'm getting the error below while running a Spark job:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 7, rwlp931.rw.discoverfinancial.com): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 37. To avoid this, increase spark.kryoserializer.buffer.max value.
    at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:265)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

But I don't see the spark.kryoserializer.buffer.max property set anywhere on my server.


Re: Kryo serialization failed

@suresh krish

If you look at the environment variables in the Spark UI, you can see that this particular job is using the serialization property below. If you can't see it in the cluster configuration, that means the user is setting it at runtime when the job is submitted.

spark.serializer        org.apache.spark.serializer.KryoSerializer
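
For example, a job can enable it at submission time in code rather than in the cluster configuration. A minimal Scala sketch (the app name is just a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

// Enable Kryo for this job only; equivalent to the property above.
val conf = new SparkConf()
  .setAppName("kryo-demo") // placeholder name
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
val sc = new SparkContext(conf)

The same property can also be passed on the command line with spark-submit --conf spark.serializer=org.apache.spark.serializer.KryoSerializer.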

Secondly, spark.kryoserializer.buffer.max is a built-in property with a default value of 64m. If required, you can increase that value at runtime. We could also set all of the Kryo serialization values at the cluster level, but that is not good practice without knowing the proper use case.
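
For instance, to raise the limit for a single run (a sketch; 256m is only an illustrative value, size it to your largest serialized object):

import org.apache.spark.SparkConf

// Raise the Kryo buffer ceiling for this job only (default is 64m).
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryoserializer.buffer.max", "256m") // illustrative value

Or equivalently, pass --conf spark.kryoserializer.buffer.max=256m to spark-submit.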

Hope this helps you.
