
Facing errors while compiling some PySpark code

New Contributor

Issue: Facing errors while compiling some PySpark code

Errors:

GC overhead limit exceeded

java.lang.OutOfMemoryError: Java heap space

Please find below the information I have collected from the current yarn-site.xml and mapred-site.xml:

yarn.scheduler.maximum-allocation-mb = 65536
yarn.scheduler.minimum-allocation-mb = 4096
mapreduce.map.java.opts = -Xmx2048m
mapreduce.map.memory.mb = 2560
mapreduce.reduce.java.opts = -Xmx4096m
mapreduce.reduce.memory.mb = 5120
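
For reference, a minimal sketch, assuming a Spark 2.x pyspark shell where the spark session variable already exists, of printing the Spark-side memory settings the session is actually running with (the mapreduce.* properties above size MapReduce containers rather than Spark executors):

    # Collect the explicitly-set Spark properties from the running session.
    conf = dict(spark.sparkContext.getConf().getAll())
    for key in ("spark.executor.memory", "spark.driver.memory"):
        # Keys that were never set explicitly fall back to Spark's built-in defaults.
        print(key, conf.get(key, "<default>"))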

1 Reply

Re: Facing errors while compiling some PySpark code

Expert Contributor

@Koushik Dey

Are you getting the out-of-memory error in an executor? If yes, you can increase properties such as spark.executor.memory, or pass the --executor-memory argument if you are using the pyspark shell. One thing I would like to point out: whenever you do a map-side aggregate, caching, or an in-memory shuffle, these take memory away from what is needed for computation, so tuning your job to match your computation will fix this issue. You can find more Spark tuning details at the link below.

http://spark.apache.org/docs/latest/tuning.html#garbage-collection-tuning
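
For a concrete starting point, here is a minimal sketch of the two approaches mentioned above, assuming Spark 2.x; the 4g figure is only an illustration and should be sized to your job and to the YARN allocation limits quoted in the question.

    # From the shell, pass the flag when launching:
    #   pyspark --executor-memory 4g
    #
    # From application code, set the property before the session (and its executors) start.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("memory-tuning-sketch")         # hypothetical application name
        .config("spark.executor.memory", "4g")   # heap per executor; illustrative value
        .getOrCreate()
    )

Note that spark.executor.memory only takes effect at application launch; changing it on an already running session has no effect.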