Community Articles

Find and share helpful community-sourced technical articles.
Announcements
Celebrating as our community reaches 100,000 members! Thank you!
Labels (1)
avatar
Master Guru

Spark job fails with below error when byte code for any particular method grows beyond 64KB

spark.sql.codegen.wholeStage is enabled by default for internal optimization in Spark2 which can cause these kind of issues in some corner cases.

Below is the detailed stack trace for your reference:

org.codehaus.janino.JaninoRuntimeException: Code of method "processNext()V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator" grows beyond 64 KB
at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:949)
at org.codehaus.janino.CodeContext.write(CodeContext.java:857)
at org.codehaus.janino.UnitCompiler.writeShort(UnitCompiler.java:11072)
at org.codehaus.janino.UnitCompiler.load(UnitCompiler.java:10744)
at org.codehaus.janino.UnitCompiler.load(UnitCompiler.java:10729)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3824)
at org.codehaus.janino.UnitCompiler.access$9100(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$12.visitLocalVariableAccess(UnitCompiler.java:3796)
at org.codehaus.janino.UnitCompiler$12.visitLocalVariableAccess(UnitCompiler.java:3762)
at org.codehaus.janino.Java$LocalVariableAccess.accept(Java.java:3675)
at org.codehaus.janino.Java$Lvalue.accept(Java.java:3563)
at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3820)
[....] Output truncated
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:782)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

.

How to fix this?

This can be fixed by setting spark.sql.codegen.wholeStage=false in custom spark2-defaults configuration via Ambari and restart required services OR adding --conf spark.sql.codegen.wholeStage=false in spark-shell or spark-submit command.

.

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!! :)

9,131 Views
0 Kudos
Comments

This configuration is applicable for Spark 2.2.x and above

Hi Team, 

I have upgraded to spark 2.2.1 but spark.sql.codegen.wholeStage=false doesn't give any improvement in performance