Spark SQL query on Hive tables
- Labels: Apache Spark
Created 05-12-2019 04:44 AM
Hi All,
I have started experimenting with the Spark client installed on our system, but I am getting the error below while running a Spark SQL query against Hive tables.
The current settings for Spark 1.6.1.2.4.2.0-258 (built for Hadoop 2.7.1.2.4.2.0-258) are:
- spark.driver.maxResultSize = 5g
- spark.kryoserializer.buffer = 2m
- spark.kryoserializer.buffer.max = 256m
org.apache.spark.SparkException: Job aborted due to stage failure: Task 625 in stage 224854.0 failed 4 times, most recent failure: Lost task 625.3 in stage 224854.0 (TID 14802942, xxxxxxxxx): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 1596. To avoid this, increase spark.kryoserializer.buffer.max value.
    at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
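
For reference, a quick way to confirm which values the running context actually picked up (a minimal sketch, assuming the Spark 1.6 shell, where sc is predefined; the fallback values shown are Spark's documented defaults, not values from this thread):

// Print the effective serializer settings; the second argument is the
// default returned when the key was never set explicitly.
println(sc.getConf.get("spark.driver.maxResultSize", "1g"))
println(sc.getConf.get("spark.kryoserializer.buffer", "64k"))
println(sc.getConf.get("spark.kryoserializer.buffer.max", "64m"))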
Created 05-12-2019 03:57 PM
Try increasing the Kryo serializer buffer value. Note that serializer settings are read when the executors start, so set them on the SparkConf before the SparkContext/SparkSession is created rather than afterwards.
On Spark 1.4 and later the property name is spark.kryoserializer.buffer.max (spark.kryoserializer.buffer.max.mb is the deprecated pre-1.4 form, which takes a plain number of megabytes), so on your 1.6.1 build you can set, for example:
conf.set("spark.kryoserializer.buffer.max", "512m")
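
For context, a minimal sketch of the full setup, assuming a standalone Spark 1.6 Scala application (the app name, table name, buffer sizes, and query below are illustrative placeholders, not values from this thread):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Serializer buffers must be on the SparkConf before the context is created;
// executor-side serializer settings cannot be changed once the context starts.
val conf = new SparkConf()
  .setAppName("spark-sql-on-hive") // hypothetical app name
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryoserializer.buffer", "2m") // initial per-task buffer
  .set("spark.kryoserializer.buffer.max", "512m") // raise the cap past the failing 256m

val sc = new SparkContext(conf)
val sqlContext = new HiveContext(sc) // Hive tables are queried through HiveContext in 1.6

// Placeholder for the query that was failing with the Kryo buffer overflow.
sqlContext.sql("SELECT count(*) FROM some_hive_table").show()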
Refer to this and this link for more details regarding this issue.
