
Kryo Serializer Not working in CDH-5.14


I am writing a Spark Streaming job that reads messages from Kafka. The serializer is set to KryoSerializer. When I run the job, I encounter the exception below:

18/10/31 16:54:02 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 5.0 (TID 6, *****, executor 4): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 297. To avoid this, increase spark.kryoserializer.buffer.max value.
   at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:318)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:386)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 297
        at com.esotericsoftware.kryo.io.Output.require(Output.java:163)
        at com.esotericsoftware.kryo.io.Output.writeString_slow(Output.java:462)
        at com.esotericsoftware.kryo.io.Output.writeString(Output.java:363)
        at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.write(DefaultSerializers.java:191)
        at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.write(DefaultSerializers.java:184)
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
        at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
        at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:366)
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:307)
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
        at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:315)
        ... 4 more
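The error message itself points at `spark.kryoserializer.buffer.max`. One way to make sure the setting actually reaches the executors is to pass it explicitly on submit rather than relying on defaults. A minimal sketch, assuming CDH's `spark2-submit` wrapper; the class name and jar are placeholders, not from my job:

```shell
# Hypothetical submit command: StreamingJob / streaming-job.jar are placeholders.
# Explicitly pass the Kryo buffer settings so they apply to all executors.
spark2-submit \
  --master yarn \
  --deploy-mode client \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryoserializer.buffer=1m \
  --conf spark.kryoserializer.buffer.max=512m \
  --class com.example.StreamingJob streaming-job.jar
```

Once the job is running, the effective values can be checked on the Environment tab of the Spark UI to confirm the configuration was not overridden elsewhere.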

Even after specifying spark.kryoserializer.buffer and spark.kryoserializer.buffer.max, I am still encountering this exception. I see it only with the Cloudera Spark 2.2.0 distribution, not with vanilla Spark 2.2.0.

The job's effective configuration is listed below.

 (spark.driver.userClassPathFirst,true)
  (spark.executor.extraLibraryPath,/var/opt/cloudera/parcels/CDH-5.14.0-1.cdh5.14.0.p0.24/lib/hadoop/lib/native)
  (spark.driver.memory,4g)
  (spark.authenticate,false)
  (spark.yarn.jars,local:/var/opt/cloudera/parcels/SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957/lib/spark2/jars/*)
  (spark.driver.extraLibraryPath,/var/opt/cloudera/parcels/CDH-5.14.0-1.cdh5.14.0.p0.24/lib/hadoop/lib/native)
  (spark.yarn.historyServer.address,http://****18089)
  (spark.yarn.am.extraLibraryPath,/var/opt/cloudera/parcels/CDH-5.14.0-1.cdh5.14.0.p0.24/lib/hadoop/lib/native)
  (spark.eventLog.enabled,true)
  (spark.dynamicAllocation.schedulerBacklogTimeout,1)
  (spark.yarn.config.gatewayPath,/var/opt/cloudera/parcels)
  (spark.ui.killEnabled,true)
  (spark.serializer,org.apache.spark.serializer.KryoSerializer)
  (spark.executor.extraJavaOptions,-Djava.security.auth.login.config=/tmp/jaas.conf)
  (spark.dynamicAllocation.executorIdleTimeout,60)
  (spark.dynamicAllocation.minExecutors,0)
  (spark.hadoop.yarn.application.classpath,)
  (spark.shuffle.service.enabled,true)
  (spark.yarn.config.replacementPath,{{HADOOP_COMMON_HOME}}/../../..)
  (spark.ui.enabled,true)
  (spark.io.encryption.enabled,false)
  (spark.driver.extraJavaOptions,-Djava.security.auth.login.config=/tmp/jaas.conf)
  (spark.sql.hive.metastore.version,1.1.0)
  (spark.kryoserializer.buffer.max,512m)
  (spark.submit.deployMode,client)
  (spark.shuffle.service.port,7337)
  (spark.network.crypto.enabled,false)
  (spark.hadoop.mapreduce.application.classpath,)
  (spark.eventLog.dir,hdfs:///user/spark/spark2ApplicationHistory)
  (spark.master,yarn)
  (spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON,/var/opt/cloudera/parcels/Anaconda-4.3.1/bin/python)
  (spark.dynamicAllocation.enabled,true)
  (spark.sql.catalogImplementation,hive)
  (spark.yarn.appMasterEnv.PYSPARK_PYTHON,/var/opt/cloudera/parcels/Anaconda-4.3.1/bin/python)
  (spark.sql.hive.metastore.jars,${env:HADOOP_COMMON_HOME}/../hive/lib/*:${env:HADOOP_COMMON_HOME}/client/*)
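Since the config dump shows the value is present on the driver, another avenue worth trying is setting it cluster-wide so every gateway and executor picks it up, rather than per job. A hedged sketch, assuming the CDH gateway config path below (on a CDH cluster this would normally be done through the Spark 2 client configuration safety valve in Cloudera Manager, not by editing the file directly):

```shell
# Hypothetical: append the Kryo buffer settings to the gateway's
# spark-defaults.conf so all spark2-submit invocations inherit them.
cat >> /etc/spark2/conf/spark-defaults.conf <<'EOF'
spark.kryoserializer.buffer=1m
spark.kryoserializer.buffer.max=512m
EOF
```

Note that `spark.driver.userClassPathFirst=true` appears in the config above; if the application jar bundles its own Spark or Kryo classes, they can shadow the cluster's, so it may also be worth re-testing with that flag disabled.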
