I have been running the Big Data Benchmark at scale factor 5 on an Impala cluster launched by Cloudera Director, with 5 i2.xlarge workers and only default settings.
It ran without a hitch until query 3b, which hit the memory limit (defaulting to ~13 GB) and crashed. I raised the memory limit to 20 GB and query 3b succeeded. Query 3c kept crashing no matter what I set the limit to, and when I disabled the memory limit entirely (set it to -1) I got the following profile for the failed query:
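For context, this is roughly how I was adjusting the limit per session in impala-shell (the exact unit suffix syntax may vary by Impala version, so treat this as a sketch):

```sql
-- Raise the per-query memory limit for this session to 20 GB
SET MEM_LIMIT=20g;

-- Disable the memory limit entirely (what I tried for query 3c)
SET MEM_LIMIT=-1;
```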
I next computed the table stats, since they had not previously been computed, but no matter what I set the memory limit to it kept erroring (though for these runs I didn't manage to save the query profiles).
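The stats were computed with the standard COMPUTE STATS statement, once per benchmark table (I believe the Big Data Benchmark tables are named as below, but the names here are from memory):

```sql
-- Gather table and column statistics so the planner can size joins
COMPUTE STATS rankings;
COMPUTE STATS uservisits;
```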
I checked the docs here and only saw references to turning off disk spilling, not to making it happen when the memory limit is hit.
Is this a current limitation of Impala, or have I missed some sort of config?