We are trying to run a Spark job on YARN. The problem is that the usercache directory (yarn/nm/usercache/) is growing too fast and will fill the entire disk. The Hive table we read is around 70 GB and the total free disk space is around 300 GB. The usercache directory contains many large folders such as blockmgr-b5b55c6f-ef8a-4359-93e4-9935f2390367.
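For reference, this is roughly how I'm observing the growth (the base path is an assumption; adjust it to your yarn.nodemanager.local-dirs setting):

```shell
# Summarize per-directory usage of the Spark block-manager folders
# under the NodeManager local dir, largest first.
NM_LOCAL=${NM_LOCAL:-/yarn/nm}
du -sh "$NM_LOCAL"/usercache/*/appcache/*/blockmgr-* 2>/dev/null | sort -rh | head
```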
My questions are:
- Is this normal, or is something wrong with my Spark or YARN configuration?
- If these files are just cached files, is there any way to limit the cache size?
Thanks for your help.