
Memory Limit Exceeded

New Contributor

After upgrading to Impala 2.0, many queries that worked under 1.4 fail with "Memory Limit Exceeded" almost immediately after execution starts.

I have tried setting "Impala Daemon Memory Limit" to more than 100 GB (I have 128 GB of RAM per node), but that only fixes part of the problem.

 

From the message below, memory consumption is only around 500 MB. Even if Impala runs out of memory, the new "spill to disk" feature in 2.0 should kick in.

 

WARNINGS: Memory limit exceeded
Query did not have enough memory to get the minimum required buffers.


Backend 33:Memory Limit Exceeded
Query(ec411e9f24ff25b1:e0b8fa64d0d9daa) Limit: Consumption=508.52 MB
  Fragment ec411e9f24ff25b1:e0b8fa64d0d9db5: Consumption=6.27 MB
    UDFs: Consumption=0
    SORT_NODE (id=2): Consumption=4.00 KB
    AGGREGATION_NODE (id=4): Consumption=6.25 MB
    EXCHANGE_NODE (id=3): Consumption=0
    DataStreamRecvr: Consumption=0
    DataStreamSender: Consumption=4.00 KB
  Block Manager: Limit=86.40 GB Consumption=370.25 MB
  Fragment ec411e9f24ff25b1:e0b8fa64d0d9dcd: Consumption=502.25 MB
    UDFs: Consumption=60.00 KB
    AGGREGATION_NODE (id=1): Consumption=408.00 MB
    HDFS_SCAN_NODE (id=0): Consumption=94.09 MB
    DataStreamSender: Consumption=96.00 KB
WARNING: The following tables are missing relevant table and/or column statistics.
dns.pdns_query
Backend 36:Memory Limit Exceeded
Query(ec411e9f24ff25b1:e0b8fa64d0d9daa) Limit: Consumption=510.94 MB
  Fragment ec411e9f24ff25b1:e0b8fa64d0d9db8: Consumption=6.27 MB
    UDFs: Consumption=0
    SORT_NODE (id=2): Consumption=4.00 KB
    AGGREGATION_NODE (id=4): Consumption=6.25 MB
    EXCHANGE_NODE (id=3): Consumption=0
    DataStreamRecvr: Consumption=0
    DataStreamSender: Consumption=4.00 KB
  Block Manager: Limit=86.40 GB Consumption=330.25 MB
  Fragment ec411e9f24ff25b1:e0b8fa64d0d9dd0: Consumption=504.67 MB
    UDFs: Consumption=60.00 KB
    AGGREGATION_NODE (id=1): Consumption=368.00 MB
    HDFS_SCAN_NODE (id=0): Consumption=136.51 MB
    DataStreamSender: Consumption=96.00 KB
WARNING: The following tables are missing relevant table and/or column statistics.
dns.pdns_query
Backend 43:Memory Limit Exceeded
Query(ec411e9f24ff25b1:e0b8fa64d0d9daa) Limit: Consumption=508.56 MB
  Fragment ec411e9f24ff25b1:e0b8fa64d0d9dbf: Consumption=6.27 MB
    UDFs: Consumption=0
    SORT_NODE (id=2): Consumption=4.00 KB
    AGGREGATION_NODE (id=4): Consumption=6.25 MB
    EXCHANGE_NODE (id=3): Consumption=0
    DataStreamRecvr: Consumption=0
    DataStreamSender: Consumption=4.00 KB
  Block Manager: Limit=86.40 GB Consumption=378.25 MB
  Fragment ec411e9f24ff25b1:e0b8fa64d0d9dd7: Consumption=502.30 MB
    UDFs: Consumption=60.00 KB
    AGGREGATION_NODE (id=1): Consumption=416.00 MB
    HDFS_SCAN_NODE (id=0): Consumption=86.14 MB
    DataStreamSender: Consumption=96.00 KB
WARNING: The following tables are missing relevant table and/or column statistics.
dns.pdns_query

 

Any idea what is limiting Impala's memory usage, or how I can fix it? Thank you in advance.

3 REPLIES

Re: Memory Limit Exceeded

New Contributor

Solved it by disabling "Integrated Resource Management with YARN".

 

I noticed from the documentation that Impala requires table statistics to estimate its memory reservations, which we don't have. Alternatively, we could manually set MEM_LIMIT.
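For anyone landing here, a minimal sketch of the manual MEM_LIMIT workaround in impala-shell (the 10g value is illustrative, not a recommendation — size it to your workload):

```sql
-- Cap per-node memory for this session explicitly, so the query is not
-- dependent on a reservation estimated from (missing) table statistics.
SET MEM_LIMIT=10g;
SELECT COUNT(DISTINCT name) FROM pdns_query;

-- Setting it back to 0 removes the per-query cap, leaving only the
-- Impala daemon's process-wide limit in effect.
SET MEM_LIMIT=0;
```

A SET in impala-shell applies only to the current session, so this can be used per problem query without touching the daemon configuration.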

Re: Memory Limit Exceeded

Cloudera Employee

With the spill-to-disk feature, memory usage grows during the query as in previous releases. When the query nears the point where spilling might be necessary, Impala grabs some extra memory to use for that work, so there is a period where more memory is used than in previous releases. If that extra memory can't be allocated, that's when the "minimum buffers" error occurs. Once the memory allocation succeeds and the spill-to-disk feature kicks in, memory usage should remain level for the rest of the query.

 

This extra memory allocation is not expected to cause query failures, except in rare cases that were right on the edge of running out of memory before. I'm not sure how the YARN aspect factors into this case; I'll have to leave that for the dev team experts.
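One way to inspect what the planner thinks a query needs — which is the figure integrated resource management uses when reserving memory from YARN — is EXPLAIN. A hedged sketch (exact output wording varies by release):

```sql
-- The plan header includes the planner's per-host memory estimate.
-- Without table/column stats this estimate can be far too low, which
-- would explain an undersized YARN reservation and the ~500 MB ceiling
-- seen in the error dumps above.
EXPLAIN SELECT COUNT(DISTINCT name) FROM pdns_query;
-- Look for a line similar to:
--   Estimated Per-Host Requirements: Memory=... VCores=...
-- along with the same missing-statistics warning shown in the logs.
```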

 

John

Re: Memory Limit Exceeded

New Contributor

Thank you for your reply.

 

I've conducted more tests today; it seems the absence of table/column stats can break many SQL queries when YARN integration is enabled. The dynamic memory allocation/expansion does not work as expected.

 

However, we have dozens of very large tables (tens of TB) that are updated daily, and "compute stats" simply takes too much time.
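For daily-updated partitioned tables, a possible way to keep the cost bounded is incremental statistics, which (in Impala releases that support it, 2.1 and later — so not the 2.0 release discussed here) only scan partitions lacking stats. A hypothetical sketch, with the partition column name purely illustrative:

```sql
-- Scans only partitions that do not yet have incremental stats, so the
-- daily cost is proportional to the newly loaded data, not the whole table.
COMPUTE INCREMENTAL STATS pdns_query;

-- Or target just the newly loaded partition explicitly
-- ("day" is a made-up partition column for illustration):
COMPUTE INCREMENTAL STATS pdns_query PARTITION (day='2014-12-01');
```

On Impala 2.0 itself, the fallback would be running plain COMPUTE STATS off-peak, or setting MEM_LIMIT manually as in the earlier reply.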

 

 

[slave01:21000] > create table test like pdns_query stored as parquet;
Query: create table test like pdns_query stored as parquet

Fetched 0 row(s) in 0.16s
[slave01:21000] > insert into test select * from pdns_query;
Query: insert into test select * from pdns_query
Inserted 265588331 row(s) in 20.62s
[slave01:21000] > select count(distinct name) from test;
Query: select count(distinct name) from test
WARNINGS: Memory limit exceeded
Query did not have enough memory to get the minimum required buffers.


Backend 26:Memory Limit Exceeded
Query(bf4efcc7d58a89c7:df4aa7cad66c67a8) Limit: Consumption=508.48 MB
  Fragment bf4efcc7d58a89c7:df4aa7cad66c67ac: Consumption=6.28 MB
    UDFs: Consumption=0
    AGGREGATION_NODE (id=2): Consumption=4.00 KB
    AGGREGATION_NODE (id=4): Consumption=6.25 MB
    EXCHANGE_NODE (id=3): Consumption=0
    DataStreamRecvr: Consumption=0
    DataStreamSender: Consumption=16.00 KB
  Block Manager: Limit=80.60 GB Consumption=370.25 MB
  Fragment bf4efcc7d58a89c7:df4aa7cad66c67c4: Consumption=502.21 MB
    UDFs: Consumption=0
    AGGREGATION_NODE (id=1): Consumption=408.00 MB
    HDFS_SCAN_NODE (id=0): Consumption=94.07 MB
    DataStreamSender: Consumption=127.88 KB
WARNING: The following tables are missing relevant table and/or column statistics.
dns.test
Backend 41:Memory Limit Exceeded
Query(bf4efcc7d58a89c7:df4aa7cad66c67a8) Limit: Consumption=508.55 MB
  Fragment bf4efcc7d58a89c7:df4aa7cad66c67bb: Consumption=6.28 MB
    UDFs: Consumption=0
    AGGREGATION_NODE (id=2): Consumption=4.00 KB
    AGGREGATION_NODE (id=4): Consumption=6.25 MB
    EXCHANGE_NODE (id=3): Consumption=0
    DataStreamRecvr: Consumption=0
    DataStreamSender: Consumption=16.00 KB
  Block Manager: Limit=80.62 GB Consumption=370.25 MB
  Fragment bf4efcc7d58a89c7:df4aa7cad66c67d3: Consumption=502.28 MB
    UDFs: Consumption=0
    AGGREGATION_NODE (id=1): Consumption=408.00 MB
    HDFS_SCAN_NODE (id=0): Consumption=94.14 MB
    DataStreamSender: Consumption=127.88 KB
WARNING: The following tables are missing relevant table and/or column statistics.
dns.test

[slave01:21000] > compute stats test;
Query: compute stats test
+-----------------------------------------+
| summary                                 |
+-----------------------------------------+
| Updated 1 partition(s) and 3 column(s). |
+-----------------------------------------+
Fetched 1 row(s) in 8.66s
[slave01:21000] > select count(distinct name) from test;
Query: select count(distinct name) from test
+----------------------+
| count(distinct name) |
+----------------------+
| 165805061            |
+----------------------+
Fetched 1 row(s) in 33.37s