10-27-2021 02:19 AM
Hello all,

We have a Spark Structured Streaming application (version 3.0.1.3.0.7110.0-81) that crashed after 1 hour and 15 minutes. The application reads data from an input Kafka topic, runs some transformations, and writes the result to another output Kafka topic. It uses "foreachBatch" mode with a trigger interval of 30 seconds and maxOffsetsPerTrigger set to 7000.

We get this exception:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:203)
    at scala.collection.TraversableOnce.$anonfun$addString$1(TraversableOnce.scala:369)
    at scala.collection.TraversableOnce$$Lambda$90/910599202.apply(Unknown Source)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableOnce.addString(TraversableOnce.scala:362)
    at scala.collection.TraversableOnce.addString$(TraversableOnce.scala:358)
    at scala.collection.AbstractTraversable.addString(Traversable.scala:108)
    at scala.collection.TraversableOnce.mkString(TraversableOnce.scala:328)
    at scala.collection.TraversableOnce.mkString$(TraversableOnce.scala:327)
    at scala.collection.AbstractTraversable.mkString(Traversable.scala:108)
    at scala.collection.TraversableOnce.mkString(TraversableOnce.scala:330)
    at scala.collection.TraversableOnce.mkString$(TraversableOnce.scala:330)
    at scala.collection.AbstractTraversable.mkString(Traversable.scala:108)
    at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.declareAddedFunctions(CodeGenerator.scala:541)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:197)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:39)
    at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:1278)
    at org.apache.spark.sql.catalyst.expressions.SafeProjection$.createCodeGeneratedObject(Projection.scala:170)
    at org.apache.spark.sql.catalyst.expressions.SafeProjection$.createCodeGeneratedObject(Projection.scala:167)
    at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:52)
    at org.apache.spark.sql.catalyst.expressions.SafeProjection$.create(Projection.scala:193)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Deserializer.apply(ExpressionEncoder.scala:180)
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Deserializer.apply(ExpressionEncoder.scala:173)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:512)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)

We suspect a memory leak; in the heap dump we can see many objects of type java.util.concurrent.ConcurrentHashMap.

Thanks for your help / ideas.

Laurent
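For context, the query is wired up roughly like this (a minimal sketch, not our actual code: the topic names, broker address, checkpoint path, and the batch transformation are placeholders):

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.streaming.Trigger

object StreamingJobSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kafka-to-kafka").getOrCreate()

    // Source: read from the input Kafka topic, capping each micro-batch
    // at 7000 offsets as in the real job.
    val input = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // placeholder
      .option("subscribe", "input-topic")               // placeholder
      .option("maxOffsetsPerTrigger", 7000L)
      .load()

    // Sink: foreachBatch with a 30-second processing-time trigger,
    // writing each micro-batch back to another Kafka topic.
    val query = input.writeStream
      .trigger(Trigger.ProcessingTime("30 seconds"))
      .option("checkpointLocation", "/tmp/checkpoint")  // placeholder
      .foreachBatch { (batch: DataFrame, batchId: Long) =>
        // The actual transformations are elided; the Kafka sink only
        // needs a string or binary "value" column.
        batch.selectExpr("CAST(value AS STRING) AS value")
          .write
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092") // placeholder
          .option("topic", "output-topic")                  // placeholder
          .save()
      }
      .start()

    query.awaitTermination()
  }
}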