Kryo serialization failed: Buffer overflow

New Contributor

I am getting org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow when I call collect on an RDD of about 1 GB (for example: My1GBRDD.collect).

When I run the same thing on a smaller RDD (600 MB), it executes successfully. The problem appears only with RDDs of about 1 GB or more.

For more details, these are the steps I perform:

1. Create an RDD from the input file.

2. mapToPair on the RDD.

3. groupByKey() on the RDD.

4. collectAsMap on the RDD.

 

At the 4th step I get the following SparkException:

org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 37
Serialization trace:
otherElements (org.apache.spark.util.collection.CompactBuffer). To avoid this, increase spark.kryoserializer.buffer.max value.
at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:350)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:393)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 37

 

2 REPLIES

Explorer

Hi,

How did you solve this issue? I have the same problem.

 

New Contributor

Increase the value of the spark.kryoserializer.buffer.max property according to the required size; the default is 64 MB.

I got the same exception, increased the value, and the job then ran properly.
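
For anyone landing here later, a minimal sketch of what that change could look like in spark-defaults.conf. The 512m value is only an illustrative assumption, not a figure from this thread; choose something larger than the biggest single object being serialized. Spark requires this setting to stay below 2048m.

```
# Assumed to be set already, since the stack trace shows KryoSerializer in use
spark.serializer                 org.apache.spark.serializer.KryoSerializer

# Raise the Kryo buffer ceiling from the 64m default (512m is a placeholder value)
spark.kryoserializer.buffer.max  512m
```

The same property can instead be passed at submit time with --conf spark.kryoserializer.buffer.max=512m.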