We have an end user who usually runs queries through Impala/HUE. He is receiving the error below, which I assume will require some changes to tuning parameters. But where do we start investigating? The error message is:
ExecQueryFInstances rpc query_id=e74ef8d9b9215369:4994cbde00000000 failed: Failed to get minimum memory reservation of 68.00 MB on daemon chia1.haas.berkeley.edu:22000 for query e74ef8d9b9215369:4994cbde00000000 because it would exceed an applicable memory limit. Memory is likely oversubscribed. Reducing query concurrency or configuring admission control may help avoid this error.
Memory usage:
Process: Limit=97.35 GB Total=80.62 GB Peak=85.72 GB
  Buffer Pool: Free Buffers: Total=0
  Buffer Pool: Clean Pages: Total=4.24 GB
  Buffer Pool: Unused Reservation: Total=-4.24 GB
  Free Disk IO Buffers: Total=1.41 GB Peak=1.99 GB
  RequestPool=root.hue: Total=78.60 GB Peak=82.56 GB
    Query(2f4b5cff11212907:886aa1400000000): Reservation=77.88 GB ReservationLimit=77.88 GB OtherMemory=731.77 MB Total=78.60 GB Peak=78.92 GB
    Query(e74ef8d9b9215369:4994cbde00000000): Reservation=0 ReservationLimit=77.88 GB OtherMemory=0 Total=0 Peak=0
  RequestPool=root.mdevaan: Total=0 Peak=18.23 GB
  RequestPool=root.bergquist: Total=0 Peak=84.21 GB
  RequestPool=root.saqib: Total=0 Peak=8.93 GB
  Untracked Memory: Total=631.52 MB
This is an OOM (Out Of Memory) error. It simply means that this query needs more memory than is available to complete; it usually occurs when the cluster is under load. The request has exceeded the available memory.
The solution is to add more memory to your nodes or to add more nodes. Otherwise, try to optimize your queries, and make sure that your Impala settings are optimal too.
While I would normally assume this is true, the error is occurring on queries that were previously successful and completing without errors. Something has changed the memory consumption behavior and the error message received hasn't been particularly helpful for troubleshooting. I will continue the investigation.
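One place to start is the memory dump inside the error itself. The following is a minimal Python sketch, not an official Impala tool, that pulls the per-query figures out of such a dump and ranks the queries by how much memory they hold. The names `SAMPLE`, `to_bytes`, and `summarize` are made up for illustration, and the regular expressions assume the dump layout shown in the error above.

```python
import re

# Excerpt of the memory dump from the error message in this thread.
SAMPLE = (
    "Process: Limit=97.35 GB Total=80.62 GB Peak=85.72 GB "
    "RequestPool=root.hue: Total=78.60 GB Peak=82.56 GB "
    "Query(2f4b5cff11212907:886aa1400000000): Reservation=77.88 GB "
    "ReservationLimit=77.88 GB OtherMemory=731.77 MB Total=78.60 GB Peak=78.92 GB "
    "Query(e74ef8d9b9215369:4994cbde00000000): Reservation=0 "
    "ReservationLimit=77.88 GB OtherMemory=0 Total=0 Peak=0"
)

# A size is a number with an optional unit, e.g. "77.88 GB", "731.77 MB" or "0".
SIZE = r"-?[\d.]+(?:\s[KMG]?B\b)?"
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

PROCESS_RE = re.compile(rf"Process:\s+Limit=({SIZE})\s+Total=({SIZE})")
QUERY_RE = re.compile(
    rf"Query\(([0-9a-f:]+)\):\s+Reservation=({SIZE})\s+"
    rf"ReservationLimit=({SIZE})\s+OtherMemory=({SIZE})\s+Total=({SIZE})"
)


def to_bytes(text):
    """Convert a size such as '77.88 GB' or '0' to a number of bytes."""
    parts = text.split()
    unit = parts[1] if len(parts) > 1 else "B"
    return float(parts[0]) * UNITS[unit]


def summarize(dump):
    """Return the process limit and total plus per-query usage, largest first."""
    limit, total = (to_bytes(g) for g in PROCESS_RE.search(dump).groups())
    queries = [
        {
            "id": m.group(1),
            "reservation": to_bytes(m.group(2)),
            "reservation_limit": to_bytes(m.group(3)),
            "total": to_bytes(m.group(5)),
        }
        for m in QUERY_RE.finditer(dump)
    ]
    queries.sort(key=lambda q: q["total"], reverse=True)
    return {"process_limit": limit, "process_total": total, "queries": queries}


if __name__ == "__main__":
    report = summarize(SAMPLE)
    for q in report["queries"]:
        share = q["total"] / report["process_total"]
        print(f'{q["id"]}: {share:.1%} of the memory in use on this daemon')
```

Run against the dump above, this puts query 2f4b5cff11212907:886aa1400000000 in pool root.hue at the top: it holds 78.60 GB of the 80.62 GB in use on that daemon, and its reservation has reached its ReservationLimit of 77.88 GB. The failing query itself holds nothing yet, so the other query is the one to look at first.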
Thank you for your assessment, which is spot-on.
I think that the adjustments to MEMORY_LIMIT, plus previously setting the idle session and query limits to 10m, helped with the prior OOM problem, especially when multiple users are active. We are relatively new to Hadoop/Impala, so we are still learning a lot about how to configure and work with the data within the environment, how to tune performance, and how to tailor the configuration to work well within the limits of the available hardware.
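For anyone who wants to try similar limits per session before changing the cluster-wide configuration, here is a small Python sketch that builds the corresponding `SET` statements. It assumes that MEMORY_LIMIT above refers to Impala's `MEM_LIMIT` query option and that the idle query limit maps to the `QUERY_TIMEOUT_S` query option (in seconds). The helper names and the `8g` value are hypothetical, not our actual settings.

```python
import re


def duration_to_seconds(text):
    """Convert a duration such as '10m', '45s' or '2h' to whole seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip().lower())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    factor = {"s": 1, "m": 60, "h": 3600}[match.group(2)]
    return int(match.group(1)) * factor


def session_settings(mem_limit, idle_query_timeout):
    """Build SET statements to run at the start of an Impala session."""
    return [
        f"SET MEM_LIMIT={mem_limit};",
        f"SET QUERY_TIMEOUT_S={duration_to_seconds(idle_query_timeout)};",
    ]


if __name__ == "__main__":
    # "8g" is a placeholder; choose a value that fits the node's memory.
    for statement in session_settings("8g", "10m"):
        print(statement)
```

The statements can be pasted into impala-shell or the HUE editor ahead of the query. Cluster-wide idle timeouts are configured separately on the Impala daemons and are not covered by this sketch.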
In general, I wish there were documentation that specifically addresses errors and performance tuning. It would provide more insight into troubleshooting problems and interpreting the errors received. Maybe there is a book already published that's good for this? It would be a huge help!