My Sqoop jobs and Hive queries are randomly getting killed.
All I get in the job logs is:
Diagnostics: Application killed by a user.
I know for sure that these jobs are not being killed by anyone.
My RM is running in HA mode. All my services are up and running without any warnings. I don't think it could be a memory issue, since my servers have available memory and it happens even when very few jobs are running. Please help.
It could still be a memory issue, and it has nothing to do with how much memory is free on the servers. Each container running on YARN has a maximum memory size, and the NodeManager monitors it closely. When a container uses more memory than it was allocated, it gets killed. Also keep in mind that the container size covers the whole JVM, so not 100% of it is available for map or reduce data.
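To make the "whole JVM" point concrete, here is a minimal sketch of how you might pick a heap size that leaves headroom inside a container. The 0.8 ratio is only a commonly used rule of thumb that I am assuming here, and the function names are made up for illustration. The result would go into something like mapreduce.map.java.opts, with the container size itself in mapreduce.map.memory.mb.

```python
def heap_for_container(container_mb: int, heap_ratio: float = 0.8) -> int:
    """Return a JVM heap size (MB) that leaves headroom inside a YARN container.

    heap_ratio is an assumed rule of thumb, not a value read from your cluster.
    The remaining share is left for non-heap JVM memory (metaspace, thread
    stacks, native buffers), which also counts against the container limit.
    """
    if container_mb <= 0:
        raise ValueError("container_mb must be positive")
    if not 0 < heap_ratio < 1:
        raise ValueError("heap_ratio must be between 0 and 1")
    return int(container_mb * heap_ratio)


def java_opts(container_mb: int, heap_ratio: float = 0.8) -> str:
    """Format the heap size as a -Xmx option string."""
    return f"-Xmx{heap_for_container(container_mb, heap_ratio)}m"


if __name__ == "__main__":
    # Print a suggested heap option for a few container sizes.
    for size_mb in (1024, 2048, 4096):
        print(size_mb, java_opts(size_mb))
```

If the heap is set equal to the container size, the JVM's non-heap memory pushes the process over the limit, and that is the case where the NodeManager steps in.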
For example, on Spark you have to calculate how big a container you need if you want to use X MB of memory for cache and Y MB of memory for code.
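A rough sketch of that calculation follows. It treats the executor heap as simply X + Y, which is a simplification of how Spark actually divides its heap. The overhead defaults (10% of executor memory, at least 384 MB) match Spark's documented default for spark.executor.memoryOverhead. The 1024 MB rounding increment is an assumption, since the real value depends on your scheduler settings. The function name is hypothetical.

```python
import math


def spark_container_mb(cache_mb: int, code_mb: int,
                       overhead_factor: float = 0.10,
                       min_overhead_mb: int = 384,
                       increment_mb: int = 1024) -> int:
    """Estimate the YARN container size (MB) requested for one Spark executor.

    cache_mb + code_mb stands in for the executor heap (simplified model).
    increment_mb is an assumed scheduler rounding step; check your cluster.
    """
    heap_mb = cache_mb + code_mb
    # Off-heap overhead is added on top of the heap: a fraction of the heap,
    # but never less than the minimum.
    overhead_mb = max(min_overhead_mb, int(heap_mb * overhead_factor))
    requested_mb = heap_mb + overhead_mb
    # YARN rounds the request up to a multiple of the allocation increment.
    return math.ceil(requested_mb / increment_mb) * increment_mb


if __name__ == "__main__":
    # 1024 MB for cache plus 1024 MB for code needs more than a 2048 MB container.
    print(spark_container_mb(1024, 1024))
```

The point is that asking for X + Y MB of heap results in a noticeably larger container, so the container limit has to be sized for the total and not just for the data you plan to hold.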