Spark SQL query on Hive tables


Hi All,

I have started experimenting with the Spark client installed on our system, but I am getting the error below while running Spark SQL.

The current setup:

Spark 1.6.1.2.4.2.0-258 built for Hadoop 2.7.1.2.4.2.0-258

spark.driver.maxResultSize = 5g
spark.kryoserializer.buffer = 2m
spark.kryoserializer.buffer.max = 256m
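(For reference, a rough sketch of how settings like these would be applied on the SparkConf in a Scala job; our actual submit setup may differ.)

import org.apache.spark.SparkConf

// Sketch only: mirrors the settings listed above on a SparkConf,
// set before the SparkContext is created.
val conf = new SparkConf()
  .set("spark.driver.maxResultSize", "5g")
  .set("spark.kryoserializer.buffer", "2m")
  .set("spark.kryoserializer.buffer.max", "256m")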




org.apache.spark.SparkException: Job aborted due to stage failure: Task 625 in stage 224854.0 failed 4 times, most recent failure: Lost task 625.3 in stage 224854.0 (TID 14802942, xxxxxxxxx): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 1596. To avoid this, increase spark.kryoserializer.buffer.max value.
    at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:

1 Reply

Master Guru

@Jacob Paul

Try increasing the Kryo serializer buffer value when you initialize the SparkContext / SparkSession; serializer settings have to be on the SparkConf before the context is created for the executors to pick them up.

Change the property name from spark.kryoserializer.buffer.max to spark.kryoserializer.buffer.max.mb:

conf.set("spark.kryoserializer.buffer.max.mb", "512")

Refer to this and this link for more details regarding this issue.