This is the second time this has happened to me.
I have a cron job that computes sessions.
If I am running other jobs at the same time, the cron job occasionally ends up in a state where:
- It has only one map task left to run
- It has already started the reducers
- The workers are doing nothing (CPU usage is very low, memory is normal)
I waited twice as long as the job normally takes to finish, and it made no progress, even though I had stopped running other queries on the cluster.
Afterwards I ran the cron job again (without doing anything else on the cluster) and it finished in the normal amount of time (1h30).
How should I configure YARN so that the job stops ending up in this state?
Why is this happening?
I think I found the reason why this is happening:
By default, mapreduce.job.reduce.slowstart.completedmaps is 0.8, so the reducers are launched once 80% of the map tasks have completed. If those early reducers grab all the available containers while other jobs are also running on the cluster, the last map task can never get a container, and the job stalls: the reducers sit idle waiting for output from a map that is never scheduled.
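If that is indeed the cause, one workaround is to raise the threshold so reducers only start once every map has finished. Below is a minimal sketch of how I could set this per job through the standard MapReduce Job API; the class name and job name are placeholders, and the mapper/reducer setup is the same as in my existing job:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SessionsJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Do not launch any reducer until 100% of the map tasks have completed,
        // so early reducers cannot hold containers that the last map task needs.
        conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 1.0f);

        Job job = Job.getInstance(conf, "sessions");
        // ... set mapper, reducer, input/output paths as in the existing job ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If the job implements Tool, the same setting could instead be passed on the command line with -D mapreduce.job.reduce.slowstart.completedmaps=1.0, or set cluster-wide in mapred-site.xml.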