
Spark sql query on hive tables


Hi All,

I have started experimenting with the Spark client installed on our system, but I am getting the error below while running Spark SQL.

The current settings (Spark built for Hadoop):

spark.driver.maxResultSize = 5g
spark.kryoserializer.buffer = 2m
spark.kryoserializer.buffer.max = 256m
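For reference, the same settings would look like this in conf/spark-defaults.conf (the standard Spark defaults file; the values are the ones quoted above):

```
# conf/spark-defaults.conf
spark.driver.maxResultSize        5g
spark.kryoserializer.buffer       2m
spark.kryoserializer.buffer.max   256m
```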


org.apache.spark.SparkException: Job aborted due to stage failure: Task 625 in stage 224854.0 failed 4 times, most recent failure: Lost task 625.3 in stage 224854.0 (TID 14802942,xxxxxxxxx): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 1596. To avoid this, increase spark.kryoserializer.buffer.max value.

at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)

at org.apache.spark.executor.Executor$

at java.util.concurrent.ThreadPoolExecutor.runWorker(

at java.util.concurrent.ThreadPoolExecutor$


Driver stacktrace:
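The failure mode itself is easy to reproduce in miniature outside of Spark: any serializer with a hard cap on its output buffer fails the same way once an object's serialized form exceeds the cap. A minimal Python sketch of the idea (pickle stands in for Kryo here; the cap and the objects are made-up illustrations, not Spark internals):

```python
import pickle


def serialize_with_cap(obj, max_bytes):
    """Serialize obj, but fail if the result exceeds max_bytes --
    analogous to Kryo overflowing spark.kryoserializer.buffer.max."""
    data = pickle.dumps(obj)
    if len(data) > max_bytes:
        raise OverflowError(
            f"Buffer overflow. Available: {max_bytes}, required: {len(data)}"
        )
    return data


# A small object fits within the cap...
small = serialize_with_cap(list(range(10)), max_bytes=1024)

# ...but a large one overflows it, just as a large task result
# overflows the Kryo buffer in the stack trace above.
try:
    serialize_with_cap(list(range(100_000)), max_bytes=1024)
except OverflowError as e:
    print(e)
```

Raising the cap (Spark's spark.kryoserializer.buffer.max) is the direct fix, which is exactly what the error message suggests.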


Re: Spark sql query on hive tables

Super Guru

@Jacob Paul

Try increasing the Kryo serializer buffer value when you initialize the SparkContext/SparkSession; the serializer settings are read at startup, so they must be in place before the session is created.

On Spark 1.4 and later, keep the property name spark.kryoserializer.buffer.max and give the value a size suffix:

conf.set("spark.kryoserializer.buffer.max", "512m")

On older versions, use the property name spark.kryoserializer.buffer.max.mb instead, with the value in megabytes:

conf.set("spark.kryoserializer.buffer.max.mb", "512")
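Equivalently, since these properties are read when the JVM starts, they can be passed on the command line at launch time. A sketch of the spark-submit invocation (the application file name and buffer sizes here are placeholders):

```
# Hypothetical spark-submit invocation; adjust sizes to your workload.
spark-submit \
  --conf spark.kryoserializer.buffer=2m \
  --conf spark.kryoserializer.buffer.max=512m \
  your_app.py
```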

Refer to this and this link for more details regarding this issue.
