My cluster has 147 GB of memory, and I get this error even though the server has not used its entire memory.
I can see that there is free memory, yet my jobs get killed with this error. There is no error in the logs, and I don't see any error from the dmesg command or in /var/log/messages.
Also, it happens randomly and on any of the nodes. Please suggest. I have been trying to get in touch with Cloudera sales support but have had no luck, and it's urgent.
These are not Spark jobs; they are Hive and Sqoop jobs. They get killed randomly throughout the day: with the same configuration, they sometimes run and sometimes don't.