Created on 08-11-2017 01:44 AM - edited 09-16-2022 05:04 AM
As per the title really. I'm seeing high memory usage on a few data nodes, which is down to Impala; however, at the time there are no active Impala queries running on those nodes.
If I restart the impalad, the memory clears and remains free until a query runs. I can watch the query using the memory, but when it finishes the memory is not released back to the system.
Created 08-11-2017 06:15 AM
As far as I understand how Impala works, that is the expected behaviour.
It is indeed intended for speeding up later queries that use the same sets of data.
Created 08-11-2017 07:55 AM
It's normal for idle Impala daemons to hold onto 1-2GB of memory plus the JVM heap memory. Any more than that may indicate something wrong.
The memz debug page (http://impala-daemon:25000/memz?detailed=true) has diagnostics for this. You can see there whether a fragment of a query is holding onto memory.
There are two common ways this can happen:
* The query wasn't cancelled and closed. In this case it should show up on the /queries page of the coordinator.
* The query was cancelled and closed, but a fragment continued running (this is a bug). In that case you should see a running fragment-execution thread in /threadz on the Impala daemon web page.
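If you need to check several daemons, the three debug pages mentioned above can be pulled with a short script rather than opened by hand. This is only a rough sketch: the helper names `debug_urls` and `fetch_pages` are made up for illustration, port 25000 is taken from the example URL above, and it assumes the debug web UI is reachable over plain HTTP without authentication. It just downloads the raw pages; it does not try to parse them.

```python
from urllib.request import urlopen

# Debug pages named in the answer above. Port 25000 comes from the example URL;
# adjust it if your daemons expose the web UI elsewhere.
DEBUG_PAGES = {
    "memory": "/memz?detailed=true",
    "queries": "/queries",
    "threads": "/threadz",
}


def debug_urls(host, port=25000):
    """Build the URL of each debug page for one Impala daemon (hypothetical helper)."""
    base = f"http://{host}:{port}"
    return {name: base + path for name, path in DEBUG_PAGES.items()}


def fetch_pages(host, port=25000, timeout=10, opener=urlopen):
    """Download each debug page and return its raw body as text.

    Assumes plain HTTP and no authentication on the debug web UI.
    """
    pages = {}
    for name, url in debug_urls(host, port).items():
        with opener(url, timeout=timeout) as resp:
            pages[name] = resp.read().decode("utf-8", errors="replace")
    return pages
```

Calling `fetch_pages("impala-daemon")` returns a dict with the `memory`, `queries` and `threads` page bodies, which you can save or search for the query ID you suspect is still holding memory.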
Created 08-11-2017 07:57 AM
Thank you for the detailed answer