Pig job spilling to disk



Expert Contributor

My Pig job runs for a long time and then runs out of disk space. I was able to identify the job and the disk.


This log file is huge. Eventually the disk reaches 100% and the job fails. The tmp file under /yarn is 200G. This node (DataNode and NodeManager) is where the one reducer is still running.




How can I manage this situation? Why does it spill to local disk?


Re: Pig job spilling to disk

Expert Contributor

We tried adding these parameters and still see it dump these bags, which are smaller than 1TB.