Issue:
Facing errors while running some of my PySpark code.
Errors:
GC overhead limit exceeded
java.lang.OutOfMemoryError: Java heap space
Please find below the information I have collected from the current yarn-site.xml and mapred-site.xml:
yarn.scheduler.maximum-allocation-mb = 65536
yarn.scheduler.minimum-allocation-mb = 4096
mapreduce.map.java.opts = -Xmx2048m
mapreduce.map.memory.mb = 2560
mapreduce.reduce.java.opts = -Xmx4096m
mapreduce.reduce.memory.mb = 5120
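Note: since this is a PySpark job on YARN, the executor and driver heap sizes are governed by Spark's own properties (spark.executor.memory, spark.driver.memory, spark.executor.memoryOverhead) rather than the mapreduce.* values above. Below is a minimal sketch of how those Spark settings could be raised when building the SparkSession; the sizes shown are illustrative assumptions, not values taken from this cluster.

from pyspark.sql import SparkSession

# Sketch only: Spark-on-YARN heap sizes come from Spark's own properties,
# not from the mapreduce.* values listed above. All sizes are assumptions.
spark = (
    SparkSession.builder
    .appName("memory-config-sketch")                # hypothetical app name
    .config("spark.executor.memory", "4g")          # executor JVM heap (assumed value)
    .config("spark.executor.memoryOverhead", "1g")  # per-executor off-heap overhead (assumed;
                                                    # spark.yarn.executor.memoryOverhead on older Spark)
    .getOrCreate()
)

# The driver heap generally has to be set before its JVM starts, e.g.:
#   spark-submit --driver-memory 4g my_job.py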