11-07-2018 01:52 PM
My Pig job runs for a long time and then runs out of disk space. I was able to identify the job and the disk.
This log file is huge. Eventually the disk reaches 100% and the job fails. The tmp file under /yarn is 200 GB. This node (DataNode and NodeManager) is the one where the one reducer is still running.
How can I manage this situation? Why does it spill to local disk?