
A Spark job fails with the error below when the bytecode for any single method grows beyond 64 KB, which is the JVM's limit on the size of a method's bytecode.

Whole-stage code generation (spark.sql.codegen.wholeStage) is enabled by default in Spark 2 as an internal optimization. In some corner cases the code it generates for a single method exceeds the 64 KB limit, which causes this kind of failure.

Below is the detailed stack trace for your reference:

org.codehaus.janino.JaninoRuntimeException: Code of method "processNext()V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator" grows beyond 64 KB
at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:949)
at org.codehaus.janino.CodeContext.write(CodeContext.java:857)
at org.codehaus.janino.UnitCompiler.writeShort(UnitCompiler.java:11072)
at org.codehaus.janino.UnitCompiler.load(UnitCompiler.java:10744)
at org.codehaus.janino.UnitCompiler.load(UnitCompiler.java:10729)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3824)
at org.codehaus.janino.UnitCompiler.access$9100(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$12.visitLocalVariableAccess(UnitCompiler.java:3796)
at org.codehaus.janino.UnitCompiler$12.visitLocalVariableAccess(UnitCompiler.java:3762)
at org.codehaus.janino.Java$LocalVariableAccess.accept(Java.java:3675)
at org.codehaus.janino.Java$Lvalue.accept(Java.java:3563)
at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3820)
[....] Output truncated
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:782)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


How to fix this?

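To fix this, disable whole-stage code generation by setting spark.sql.codegen.wholeStage=false. Either add the property to the custom spark2-defaults configuration in Ambari and restart the required services, or pass --conf spark.sql.codegen.wholeStage=false to the spark-shell or spark-submit command. Because whole-stage code generation is a performance optimization, disabling it may slow down some queries.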
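As an illustration of the command-line option, here is a minimal Python sketch that assembles a spark-submit command with the property set. It assumes you launch jobs from a small wrapper script. The helper name build_spark_submit, the application file my_job.py and its argument are hypothetical and not part of the original fix. Only the --conf spark.sql.codegen.wholeStage=false flag comes from the article.

```python
import shlex

# Property from the article: "false" disables whole-stage code generation.
WHOLE_STAGE_CONF = "spark.sql.codegen.wholeStage=false"


def build_spark_submit(app, app_args=(), extra_conf=()):
    """Return a spark-submit argument list with whole-stage codegen disabled.

    app        -- application jar or .py file (hypothetical in this sketch)
    app_args   -- arguments passed through to the application
    extra_conf -- additional "key=value" Spark properties, if any
    """
    cmd = ["spark-submit", "--conf", WHOLE_STAGE_CONF]
    for conf in extra_conf:
        cmd += ["--conf", conf]
    # spark-submit options must come before the application file.
    cmd.append(app)
    cmd.extend(app_args)
    return cmd


# Hypothetical application file and argument, for illustration only.
cmd = build_spark_submit("my_job.py", ["2018-06-12"])

# Print the command as a shell-safe string; it is not executed here.
print(shlex.join(cmd))
```

The same flag works with spark-shell. For a cluster-wide setting, the Ambari spark2-defaults route described above is likely the better fit, since it does not depend on every user remembering the flag.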


Please comment if you have any feedback/questions/suggestions. Happy Hadooping!! :)

Comments

This configuration applies to Spark 2.2.x and above.

New Contributor

Hi Team,

I have upgraded to Spark 2.2.1, but setting spark.sql.codegen.wholeStage=false doesn't give any improvement in performance.

Version history
Revision #:
1 of 1
Last update: 06-12-2018 12:15 AM