Impala very high client fetch time in Hue

Master Collaborator


In CDH 5.13, some Impala queries executed from Hue show a very high client fetch wait time.

The example below produced just 10 rows but ran for more than 3 hours. I think Hue does not close the fetch operation, so the Impala daemon assumes the client will fetch more rows. That does not make sense to me, though: the Impala daemon "knows" that 100% of the rows have been sent to the client, so why does it not cancel or close the query?


Here are the selected query stats:

Query Type: QUERY
Query State: FINISHED
Start Time: Aug 22, 2018 9:11:03 AM
End Time: Aug 22, 2018 12:26:40 PM
Duration: 3h, 15m
Rows Produced: 10
Admission Result: Admitted (queued)
Admission Wait Time: 5ms
Bytes Streamed: 353 B
Client Fetch Wait Time: 3.3h
Client Fetch Wait Time Percentage: 100
Connected User: hue/xxx
Estimated per Node Peak Memory: 32.0 MiB
HDFS Average Scan Range: 1.3 KiB
HDFS Bytes Read: 1.3 KiB
HDFS Bytes Read From Cache: 0 B
HDFS Bytes Read From Cache Percentage: 0
HDFS Local Bytes Read: 1.3 KiB
HDFS Local Bytes Read Percentage: 100
HDFS Remote Bytes Read: 0 B
HDFS Remote Bytes Read Percentage: 0
HDFS Scanner Average Read Throughput: 0 B/s
HDFS Short Circuit Bytes Read: 1.3 KiB
HDFS Short Circuit Bytes Read Percentage: 100
Impala Version: impalad version 2.10.0-cdh5.13.3 RELEASE (build 15a453e15865344e75ce0fc6c4c760696d50f626)
Out of Memory: false
Per Node Peak Memory Usage: 197.1 KiB
Planning Wait Time: 1ms
Planning Wait Time Percentage: 0
Pool: root.pool1
Query Status: OK
Session ID: 9647e779051c0b0b:302f01f9698839ba
Session Type: HIVESERVER2
Statistics Corrupt: false
Statistics Missing: true
Threads: CPU Time: 13ms
Threads: CPU Time Percentage: 78
Threads: Network Receive Wait Time: 0ms
Threads: Network Receive Wait Time Percentage: 0
Threads: Network Send Wait Time: 1ms
Threads: Network Send Wait Time Percentage: 11
Threads: Storage Wait Time: 1ms
Threads: Storage Wait Time Percentage: 11

I have a couple of questions:

 - Is this a problem on the Impala side or on the Hue side?

 - Impala has idle_session_timeout=7200 configured. Why did the Impala daemon not close the session after 2 hours of inactivity?

 - Does this hanging query occupy a "slot" in the resource pool, counting against Max Running Queries in Impala admission control? (My observation is yes, I just want to be sure.)
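To make the second question concrete, here is a minimal Python sketch that checks the arithmetic using only the Start Time, End Time, and idle_session_timeout values quoted above. The function and variable names are made up for illustration, and the sketch assumes an English locale for parsing the month name and AM/PM. It says nothing about how Impala actually decides that a session is idle; it only shows by how much the query outlived the configured value.

```python
from datetime import datetime

# Values copied from the query stats and configuration quoted in this post.
START_TIME = "Aug 22, 2018 9:11:03 AM"
END_TIME = "Aug 22, 2018 12:26:40 PM"
IDLE_SESSION_TIMEOUT_S = 7200  # idle_session_timeout, in seconds

# Timestamp layout as shown in the stats (assumes an English locale).
TIME_FORMAT = "%b %d, %Y %I:%M:%S %p"


def elapsed_seconds(start: str, end: str) -> int:
    """Return the whole number of seconds between two stats timestamps."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return int(delta.total_seconds())


duration_s = elapsed_seconds(START_TIME, END_TIME)
overrun_s = duration_s - IDLE_SESSION_TIMEOUT_S

print(f"query open for {duration_s} s")
print(f"past idle_session_timeout by {overrun_s} s")
```

With the values above, the query stayed open for 11737 seconds, which is 4537 seconds (roughly 75 minutes) beyond the 7200-second setting.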




Master Collaborator

One more observation:

During the "fetch time", the Impala daemon reports the query as:
"waiting to be closed"

Yet it has state=FINISHED, the first row has been fetched, and scan progress is 100%.


So my additional question is: why does Impala not automatically close queries that are in the FINISHED state? Is this behaviour configurable?



Master Collaborator

Edit: adding a query timeout does not change this behaviour.


I configured a 30-second query timeout in Hue, but the query has been waiting to be closed for more than 2 minutes.


This is taken directly from the query profile:

    Query Options (set by configuration): MEM_LIMIT=419430400,QUERY_TIMEOUT_S=30
    Query Options (set by configuration and planner): MEM_LIMIT=419430400,QUERY_TIMEOUT_S=30,MT_DOP=0
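For anyone who wants to check these values from a saved profile, the following is a small Python sketch that parses one of the "Query Options" lines into a dictionary. The helper name `parse_query_options` is hypothetical, and the sketch assumes the line has the exact "label: KEY=VALUE,KEY=VALUE" shape shown above; other profile versions may format it differently.

```python
def parse_query_options(line: str) -> dict:
    """Turn 'label: K1=V1,K2=V2' into {'K1': 'V1', 'K2': 'V2'}."""
    # Everything after the first ": " is the comma-separated option list.
    _, _, options = line.strip().partition(": ")
    return dict(pair.split("=", 1) for pair in options.split(",") if pair)


# Line copied from the query profile quoted in this post.
PROFILE_LINE = (
    "Query Options (set by configuration and planner): "
    "MEM_LIMIT=419430400,QUERY_TIMEOUT_S=30,MT_DOP=0"
)

opts = parse_query_options(PROFILE_LINE)
timeout_s = int(opts["QUERY_TIMEOUT_S"])
mem_limit_mib = int(opts["MEM_LIMIT"]) // (1024 * 1024)

print(f"QUERY_TIMEOUT_S = {timeout_s} s, MEM_LIMIT = {mem_limit_mib} MiB")
```

For the line above this gives a 30-second timeout and a 400 MiB memory limit, so the timeout option did reach the query even though the query was not closed.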