I'm trying to build an ML model using PySpark on Spark in YARN cluster mode.
The cluster has 120 GB of RAM and ample storage.
Yet the nb.fit(df) line of my training code throws an OutOfMemoryException. I have tried all the memory tuning parameters in YARN, along with several executor configurations (spawning more executors and fewer executors),
but in the end it always fails with the OOM exception.
I also tried reducing the number of features and the size of the data, yet it still ends with the memory issue.
The only successful model creation was with 1000 records.
I have tried many things without any result. Please help.
@sridar1992 I'm not an expert but I did find this community article that may be of interest if you haven't read it yet.
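In case it helps when checking the executor settings, here is a rough sketch in plain Python of how the YARN container size for each executor adds up. It assumes Spark's documented default memory overhead (the larger of 384 MiB and 10% of the heap) and that the driver uses the same rule in cluster mode. The function names, the 8 GiB heap sizes, and the assumption that all 120 GB is available to YARN are hypothetical and only for illustration; it also ignores YARN rounding container requests up to its allocation increment and any extra PySpark worker memory.

```python
# Rough sizing sketch, not a definitive formula for any particular cluster.
# Assumption: default overhead = max(384 MiB, 10% of the JVM heap).

MIN_OVERHEAD_MB = 384
OVERHEAD_FRACTION = 0.10


def container_mb(heap_mb: int) -> int:
    """Memory YARN must grant for one container: heap plus default overhead."""
    overhead = max(MIN_OVERHEAD_MB, int(heap_mb * OVERHEAD_FRACTION))
    return heap_mb + overhead


def max_executors(total_mb: int, executor_heap_mb: int, driver_heap_mb: int) -> int:
    """Executors that fit after reserving the driver's container (cluster mode)."""
    remaining = total_mb - container_mb(driver_heap_mb)
    return max(0, remaining // container_mb(executor_heap_mb))


# Hypothetical example: 120 GB for YARN, 8 GiB heaps for driver and executors.
total_mb = 120 * 1024
print(container_mb(8 * 1024))
print(max_executors(total_mb, 8 * 1024, 8 * 1024))
```

If the numbers passed to spark-submit ask for more than this per container or in total, YARN may refuse or kill containers regardless of how much RAM the machines have. Since the driver runs inside the cluster in cluster mode, its memory setting is worth checking as well as the executors'.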