It depends on the complexity of the query: that number is per plan node, not global, so a plan with many operators may need correspondingly more memory. It's hard to say more without seeing the plan or the execution summary (or both).
Historically there were various bugs that could cause this in certain cases, but I believe all the fixes landed in 5.10.
I agree that the message and behaviour could be a lot more helpful and actionable. Improving spill-to-disk is actually my primary focus right now, and I'm very excited about the changes we have in the pipeline.
That query probably has multiple big joins and aggregations and needs more memory to complete. A very rough rule of thumb for minimum memory in releases CDH5.9-CDH5.12 is the following.
If you add up those per-operator estimates and then add another 25% of headroom, you'll get a ballpark figure for how much memory the query will require to execute.
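To make the arithmetic concrete, here is a minimal sketch of that estimate. Note that the per-operator numbers below are hypothetical placeholders, not the actual CDH 5.9-5.12 minimums; substitute the real rule-of-thumb values for your release.

```python
# Hypothetical per-operator minimum-memory estimates in MB.
# These values are illustrative only -- they are NOT the real
# per-operator minimums for any particular release.
ESTIMATED_MIN_MB = {
    "HASH JOIN": 300,
    "AGGREGATE": 150,
    "SORT": 100,
}

def ballpark_memory_mb(plan_operators):
    """Sum the per-operator estimates, then add 25% headroom."""
    base = sum(ESTIMATED_MIN_MB.get(op, 0) for op in plan_operators)
    return base * 1.25

# Example: a plan with two big joins, an aggregation, and a sort.
estimate = ballpark_memory_mb(["HASH JOIN", "HASH JOIN", "AGGREGATE", "SORT"])
print(f"ballpark minimum: {estimate} MB")
```

The point of the 25% headroom is just to absorb the memory the estimate doesn't account for (scans, exchanges, control overhead), so treat the result as a floor rather than a precise requirement.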
I'm working on reducing those numbers and making the system give a clearer yes/no answer on whether it can run the query before it starts executing.