Created on 02-02-2017 05:21 AM - edited 09-16-2022 04:00 AM
Hi,
I have an issue after upgrade of cluster and migrate to a newer hardware. The same query "Select part_col, count(*) from table group by part_col" runs significantly slower on the new environment. In this case there are two clusters, one 4 nodes (CDH 5.4 Impala 2.2), one with 5 nodes (CDH5.8 Impala 2.6 kerberized). The worker nodes are newer and has more data disks, the processor is the same Xeon E5-2640 v3 vs E5-2650 v2.
The connectivity is 1GBit (old) vs 10Gbit(new).
So the old cluster has less nodes, less disks per node (5 vs 16), less RAM, but still manages to execute the query faster than the new cluster. I digged into the profile of the queries and found out that the step AGGREGATE takes much longer and consumes all the exec time. And in the new version it has a tag "STREAMING". The HDFS scan is in this case more or less the same, but on larger table much faster on the new cluster. So I assume this is not an IO issue.
I assume that during AGGREGATE the data is already in RAM and hashes are computed by CPU..
Does it mean that the new cluster has less powerfull CPU?
Or what other factors can affect the AGGREGATE (STREAMING) duration?
Is there any other tools or papers describing how to tune Impala in these situations?
Thanks for any advice
OLD CLUSTER - Impala 2.2 Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail --------------------------------------------------------------------------------------------------------------------- 05:MERGING-EXCHANGE 1 119.314us 119.314us 5 -1 0 -1.00 B UNPARTITIONED 02:SORT 4 435.85us 588.641us 5 -1 8.02 MB 0 04:AGGREGATE 4 255.672ms 272.347ms 5 -1 3.15 MB 128.00 MB FINALIZE 03:EXCHANGE 4 62.142us 87.60us 20 -1 0 0 HASH(part_col) 01:AGGREGATE 4 2s803ms 3s270ms 20 -1 3.17 MB 128.00 MB 00:SCAN HDFS 4 53.678ms 56.787ms 103.50M -1 1.82 MB 88.00 MB default.table NEW CLUSTER - Impala 2.6 Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail --------------------------------------------------------------------------------------------------------------------- 05:MERGING-EXCHANGE 1 82.553us 82.553us 5 -1 0 -1.00 B UNPARTITIONED 02:SORT 5 434.217us 491.025us 5 -1 8.02 MB 0 04:AGGREGATE 5 279.461ms 316.351ms 5 -1 2.30 MB 128.00 MB FINALIZE 03:EXCHANGE 5 28.378us 55.658us 25 -1 0 0 HASH(part_col) 01:AGGREGATE 5 3s965ms 4s739ms 25 -1 1.59 MB 128.00 MB STREAMING 00:SCAN HDFS 5 83.393ms 118.345ms 103.50M -1 12.04 MB 88.00 MB default.table
Created 02-03-2017 03:04 PM
It's not entirely obvious to me what changed. Since it's grouping by timestamp, you're running into https://issues.cloudera.org/browse/IMPALA-3884, which is forcing it to execute the aggregation in an interpreted mode. I suspect that the performance of that interpreted code path may have changed a bit - we tend to focus most of our tuning on the codegened code path.
On releases with IMPALA-3884 fixed it should be a *lot* faster.
" ExecOption: Codegen Disabled: HashTableCtx::CodegenEvalRow(): type TIMESTAMP NYI"
Created 02-02-2017 10:04 AM
Can you share the explain plan and the query? I'm interested to know what data types and aggregate functions you're using.
We're not aware of any performance regressions between those versions - we actually made a number of improvements, e.g. changing the hash table implementation and adding software prefetching. It's possible that your workload hit one of the edge cases where these changes had a detrimental effect.
Created 02-02-2017 10:08 AM
If you want to experiment you could try setting the query options "PREFETCH_MODE=0" and "DISABLE_STREAMING_PREAGGREGATIONS=1". You could also try adjusting NUM_SCANNER_THREADS.
With the last option, we changed our recommendation recently to set it to the # of logical cores on the machine (it was 3x logical cores originally - we changed the default in a later release). You can set the default via the "default query options" impalad startup option.
Created 02-03-2017 01:24 AM
I tried to disable prefetch and streaming, but did not helped.
Here is the profile of old Impala on old cluster:
+---------------------+----------+
| part_col | count(*) |
+---------------------+----------+
| 2017-01-01 00:00:00 | 18428899 |
| 2017-01-02 00:00:00 | 20974112 |
| 2017-01-03 00:00:00 | 13178213 |
| 2017-01-04 00:00:00 | 26768773 |
| 2017-01-05 00:00:00 | 24148741 |
+---------------------+----------+
Query Runtime Profile:
Query (id=a7442ed1a43b8e8d:8ad38fb17f29baa0):
Summary:
Session ID: f94cc4d4c075f287:175ee4069895a6b0
Session Type: BEESWAX
Start Time: 2017-01-31 09:07:47.514559000
End Time: 2017-01-31 09:07:51.159487000
Query Type: QUERY
Query State: FINISHED
Query Status: OK
Impala Version: impalad version 2.2.0-cdh5.4.8 RELEASE (build 137d99e9f751c454a0c79f3b00302938e4984f9c)
User: hadoop
Connected User: hadoop
Delegated User:
Network Address: ::ffff:xx.xx.xx.xx:33929
Default Db: default
Sql Statement: select part_col, count(*) from perf_test group by part_col order by 1
Coordinator: worker2.some.domain.net:22000
Plan:
----------------
Estimated Per-Host Requirements: Memory=256.00MB VCores=2
WARNING: The following tables are missing relevant table and/or column statistics.
default.perf_test
F02:PLAN FRAGMENT [UNPARTITIONED]
05:MERGING-EXCHANGE [UNPARTITIONED]
order by: part_col ASC
hosts=4 per-host-mem=unavailable
tuple-ids=2 row-size=24B cardinality=unavailable
F01:PLAN FRAGMENT [HASH(part_col)]
DATASTREAM SINK [FRAGMENT=F02, EXCHANGE=05, UNPARTITIONED]
02:SORT
| order by: part_col ASC
| hosts=4 per-host-mem=0B
| tuple-ids=2 row-size=24B cardinality=unavailable
|
04:AGGREGATE [FINALIZE]
| output: count:merge(*)
| group by: part_col
| hosts=4 per-host-mem=128.00MB
| tuple-ids=1 row-size=24B cardinality=unavailable
|
03:EXCHANGE [HASH(part_col)]
hosts=4 per-host-mem=0B
tuple-ids=1 row-size=24B cardinality=unavailable
F00:PLAN FRAGMENT [RANDOM]
DATASTREAM SINK [FRAGMENT=F01, EXCHANGE=03, HASH(part_col)]
01:AGGREGATE
| output: count(*)
| group by: part_col
| hosts=4 per-host-mem=128.00MB
| tuple-ids=1 row-size=24B cardinality=unavailable
|
00:SCAN HDFS [default.perf_test, RANDOM]
partitions=1/1 files=31 size=6.89GB
table stats: unavailable
column stats: unavailable
hosts=4 per-host-mem=88.00MB
tuple-ids=0 row-size=16B cardinality=unavailable
----------------
Estimated Per-Host Mem: 268435456
Estimated Per-Host VCores: 2
Tables Missing Stats: default.perf_test
Admission result: Admitted immediately
Request Pool: root.hadoop
ExecSummary:
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
---------------------------------------------------------------------------------------------------------------------
05:MERGING-EXCHANGE 1 119.314us 119.314us 5 -1 0 -1.00 B UNPARTITIONED
02:SORT 4 435.85us 588.641us 5 -1 8.02 MB 0
04:AGGREGATE 4 255.672ms 272.347ms 5 -1 3.15 MB 128.00 MB FINALIZE
03:EXCHANGE 4 62.142us 87.60us 20 -1 0 0 HASH(part_col)
01:AGGREGATE 4 2s803ms 3s270ms 20 -1 3.17 MB 128.00 MB
00:SCAN HDFS 4 53.678ms 56.787ms 103.50M -1 1.82 MB 88.00 MB default.perf_test
Planner Timeline: 3.584ms
- Analysis finished: 1.132ms (1.132ms)
- Equivalence classes computed: 1.236ms (103.831us)
- Single node plan created: 1.823ms (587.16us)
- Distributed plan created: 2.300ms (477.253us)
- Lineage info computed: 2.664ms (363.854us)
- Planning finished: 3.584ms (919.713us)
Query Timeline: 3s649ms
- Start execution: 42.805us (42.805us)
- Planning finished: 4.458ms (4.415ms)
- Submit for admission: 4.775ms (317.243us)
- Completed admission: 4.838ms (63.556us)
- Ready to start remote fragments: 5.352ms (514.114us)
- Remote fragments started: 511.206ms (505.853ms)
- Rows available: 3s612ms (3s101ms)
- First row fetched: 3s639ms (26.709ms)
- Unregister query: 3s644ms (5.608ms)
ImpalaServer:
- ClientFetchWaitTimer: 30.950ms
- RowMaterializationTimer: 72.117us
Execution Profile a7442ed1a43b8e8d:8ad38fb17f29baa0:(Total: 3s608ms, non-child: 0ns, % non-child: 0.00%)
Fragment start latencies: count: 8, last: 0.231475ns, min: 0.160659ns, max: 0.273263ns, mean: 0.221710ns, stddev: 0.041526ns
Per Node Peak Memory Usage: worker4.some.domain.net:22000(8.41 MB) worker1.some.domain.net:22000(8.10 MB) worker3.some.domain.net:22000(8.41 MB) worker2.some.domain.net:22000(8.23 MB)
- FinalizationTimer: 0ns
Coordinator Fragment F02:(Total: 3s100ms, non-child: 251.349us, % non-child: 0.01%)
MemoryUsage(500.0ms): 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB
ThreadUsage(500.0ms): 1
- AverageThreadTokens: 1.00
- PeakMemoryUsage: 24.00 KB (24576)
- PerHostPeakMemUsage: 0
- PrepareTime: 49.521us
- RowsProduced: 5 (5)
- TotalCpuTime: 3s634ms
- TotalNetworkReceiveTime: 0ns
- TotalNetworkSendTime: 0ns
- TotalStorageWaitTime: 0ns
BlockMgr:
- BlockWritesOutstanding: 0 (0)
- BlocksCreated: 96 (96)
- BlocksRecycled: 1 (1)
- BufferedPins: 0 (0)
- BytesWritten: 0
- MaxBlockSize: 8.00 MB (8388608)
- MemoryLimit: 38.40 GB (41231687680)
- PeakMemoryUsage: 8.01 MB (8396800)
- TotalBufferWaitTime: 0ns
- TotalEncryptionTime: 0ns
- TotalIntegrityCheckTime: 0ns
- TotalReadBlockTime: 0ns
EXCHANGE_NODE (id=5):(Total: 3s100ms, non-child: 119.314us, % non-child: 0.00%)
BytesReceived(500.0ms): 0, 0, 0, 0, 0, 0, 0
- BytesReceived: 134.00 B (134)
- ConvertRowBatchTime: 0ns
- DeserializeRowBatchTimer: 31.363us
- FirstBatchArrivalWaitTime: 3s100ms
- MergeGetNext: 51.896us
- MergeGetNextBatch: 45.734us
- PeakMemoryUsage: 0
- RowsReturned: 5 (5)
- RowsReturnedRate: 1.00 /sec
- SendersBlockedTimer: 0ns
- SendersBlockedTotalTimer(*): 0ns
Averaged Fragment F01:(Total: 3s604ms, non-child: 0ns, % non-child: 0.00%)
split sizes: min: 0, max: 0, avg: 0, stddev: 0
completion times: min:3s334ms max:3s367ms mean: 3s350ms stddev:15.818906.818.905508ms
execution rates: min:0.00 /sec max:0.00 /sec mean:0.00 /sec stddev:0.00 /sec
num instances: 4
- AverageThreadTokens: 1.00
- PeakMemoryUsage: 8.26 MB (8665088)
- PerHostPeakMemUsage: 8.29 MB (8689408)
- PrepareTime: 255.347ms
- RowsProduced: 1 (1)
- TotalCpuTime: 257.307ms
- TotalNetworkReceiveTime: 3s347ms
- TotalNetworkSendTime: 314.714us
- TotalStorageWaitTime: 0ns
BlockMgr:
- BlockWritesOutstanding: 0 (0)
- BlocksCreated: 96 (96)
- BlocksRecycled: 1 (1)
- BufferedPins: 0 (0)
- BytesWritten: 0
- MaxBlockSize: 8.00 MB (8388608)
- MemoryLimit: 38.40 GB (41231687680)
- PeakMemoryUsage: 8.01 MB (8399530)
- TotalBufferWaitTime: 0ns
- TotalEncryptionTime: 0ns
- TotalIntegrityCheckTime: 0ns
- TotalReadBlockTime: 0ns
CodeGen:(Total: 254.686ms, non-child: 254.686ms, % non-child: 100.00%)
- CodegenTime: 478.430us
- CompileTime: 323ns
- LoadTime: 117.38us
- ModuleFileSize: 2.23 MB (2333204)
- OptimizationTime: 0ns
- PrepareTime: 252.960ms
DataStreamSender (dst_id=5):(Total: 40.576us, non-child: 40.576us, % non-child: 100.00%)
- BytesSent: 33.00 B (33)
- NetworkThroughput(*): 111.39 KB/sec
- OverallThroughput: 713.32 KB/sec
- PeakMemoryUsage: 4.00 KB (4096)
- SerializeBatchTime: 14.713us
- ThriftTransmitTime(*): 233.563us
- UncompressedRowBatchSize: 48.00 B (48)
SORT_NODE (id=2):(Total: 3s603ms, non-child: 435.86us, % non-child: 0.01%)
- InMemorySortTime: 2.144us
- InitialRunsCreated: 1 (1)
- PeakMemoryUsage: 8.01 MB (8402944)
- RowsReturned: 1 (1)
- RowsReturnedRate: 0
- SortDataSize: 40.00 B (40)
- TotalMergesPerformed: 0 (0)
AGGREGATION_NODE (id=4):(Total: 3s603ms, non-child: 255.672ms, % non-child: 7.10%)
- BuildTime: 21.426us
- GetNewBlockTime: 55.516us
- GetResultsTime: 2.938us
- HashBuckets: 1.28K (1280)
- LargestPartitionPercent: 12 (12)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 3.14 MB (3297280)
- PinTime: 0ns
- RowsRepartitioned: 0 (0)
- RowsReturned: 1 (1)
- RowsReturnedRate: 0
- SpilledPartitions: 0 (0)
- UnpinTime: 870ns
EXCHANGE_NODE (id=3):(Total: 3s347ms, non-child: 3s347ms, % non-child: 100.00%)
- BytesReceived: 134.00 B (134)
- ConvertRowBatchTime: 8.18us
- DeserializeRowBatchTimer: 31.103us
- FirstBatchArrivalWaitTime: 2s437ms
- PeakMemoryUsage: 0
- RowsReturned: 5 (5)
- RowsReturnedRate: 1.00 /sec
- SendersBlockedTimer: 0ns
- SendersBlockedTotalTimer(*): 0ns
Averaged Fragment F00:(Total: 2s857ms, non-child: 0ns, % non-child: 0.00%)
split sizes: min: 1.27 GB, max: 1.92 GB, avg: 1.72 GB, stddev: 272.90 MB
completion times: min:1s886ms max:3s146ms mean: 2s675ms stddev:477.663985.663.984856ms
execution rates: min:624.97 MB/sec max:690.60 MB/sec mean:663.26 MB/sec stddev:23.89 MB/sec
num instances: 4
- AverageThreadTokens: 7.44
- PeakMemoryUsage: 4.93 MB (5166592)
- PerHostPeakMemUsage: 8.07 MB (8457728)
- PrepareTime: 185.387ms
- RowsProduced: 5 (5)
- TotalCpuTime: 22s474ms
- TotalNetworkReceiveTime: 0ns
- TotalNetworkSendTime: 1.299ms
- TotalStorageWaitTime: 87.376ms
CodeGen:(Total: 184.807ms, non-child: 184.807ms, % non-child: 100.00%)
- CodegenTime: 290.866us
- CompileTime: 233ns
- LoadTime: 45.102us
- ModuleFileSize: 2.23 MB (2333204)
- OptimizationTime: 0ns
- PrepareTime: 183.547ms
DataStreamSender (dst_id=3):(Total: 24.337us, non-child: 24.337us, % non-child: 100.00%)
- BytesSent: 134.00 B (134)
- NetworkThroughput(*): 120.91 KB/sec
- OverallThroughput: 5.27 MB/sec
- PeakMemoryUsage: 40.00 KB (40960)
- SerializeBatchTime: 23.984us
- ThriftTransmitTime(*): 1.114ms
- UncompressedRowBatchSize: 192.00 B (192)
AGGREGATION_NODE (id=1):(Total: 2s856ms, non-child: 2s803ms, % non-child: 98.12%)
- BuildTime: 2s601ms
- GetNewBlockTime: 44.889us
- GetResultsTime: 4.697us
- HashBuckets: 4.10K (4096)
- LargestPartitionPercent: 0 (0)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 3.17 MB (3325952)
- PinTime: 0ns
- RowsRepartitioned: 0 (0)
- RowsReturned: 5 (5)
- RowsReturnedRate: 1.00 /sec
- SpilledPartitions: 0 (0)
- UnpinTime: 792ns
HDFS_SCAN_NODE (id=0):(Total: 53.678ms, non-child: 53.678ms, % non-child: 100.00%)
- AverageHdfsReadThreadConcurrency: 0.00
- AverageScannerThreadConcurrency: 6.65
- BytesRead: 777.91 KB (796579)
- BytesReadDataNodeCache: 0
- BytesReadLocal: 777.91 KB (796579)
- BytesReadRemoteUnexpected: 0
- BytesReadShortCircuit: 777.91 KB (796579)
- DecompressionTime: 136.633us
- MaxCompressedTextFileLength: 0
- NumColumns: 1 (1)
- NumDisksAccessed: 1 (1)
- NumScannerThreadsStarted: 7 (7)
- PeakMemoryUsage: 1.74 MB (1824256)
- PerReadThreadRawHdfsThroughput: 748.79 MB/sec
- RemoteScanRanges: 0 (0)
- RowsRead: 25.87M (25874684)
- RowsReturned: 25.87M (25874684)
- RowsReturnedRate: 479.03 M/sec
- ScanRangesComplete: 7 (7)
- ScannerThreadsInvoluntaryContextSwitches: 10 (10)
- ScannerThreadsTotalWallClockTime: 19s703ms
- MaterializeTupleTime(*): 19s566ms
- ScannerThreadsSysTime: 24.0ms
- ScannerThreadsUserTime: 1s204ms
- ScannerThreadsVoluntaryContextSwitches: 25.27K (25265)
- TotalRawHdfsReadTime(*): 1.19ms
- TotalReadThroughput: 274.84 KB/sec
Fragment F01:
Instance a7442ed1a43b8e8d:8ad38fb17f29baa5 (host=worker4.some.domain.net:22000):(Total: 3s605ms, non-child: 0ns, % non-child: 0.00%)
MemoryUsage(500.0ms): 0, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.15 MB, 3.15 MB
ThreadUsage(500.0ms): 1, 1, 1, 1, 1, 1, 1
- AverageThreadTokens: 1.00
- PeakMemoryUsage: 8.41 MB (8818688)
- PerHostPeakMemUsage: 8.41 MB (8818688)
- PrepareTime: 271.989ms
- RowsProduced: 2 (2)
- TotalCpuTime: 273.763ms
- TotalNetworkReceiveTime: 3s332ms
- TotalNetworkSendTime: 359.786us
- TotalStorageWaitTime: 0ns
BlockMgr:
- BlockWritesOutstanding: 0 (0)
- BlocksCreated: 96 (96)
- BlocksRecycled: 1 (1)
- BufferedPins: 0 (0)
- BytesWritten: 0
- MaxBlockSize: 8.00 MB (8388608)
- MemoryLimit: 38.40 GB (41231687680)
- PeakMemoryUsage: 8.02 MB (8404992)
- TotalBufferWaitTime: 0ns
- TotalEncryptionTime: 0ns
- TotalIntegrityCheckTime: 0ns
- TotalReadBlockTime: 0ns
CodeGen:(Total: 271.331ms, non-child: 271.331ms, % non-child: 100.00%)
- CodegenTime: 437.945us
- CompileTime: 267ns
- LoadTime: 132.3us
- ModuleFileSize: 2.23 MB (2333204)
- OptimizationTime: 0ns
- PrepareTime: 269.916ms
DataStreamSender (dst_id=5):(Total: 41.665us, non-child: 41.665us, % non-child: 100.00%)
- BytesSent: 53.00 B (53)
- NetworkThroughput(*): 214.21 KB/sec
- OverallThroughput: 1.21 MB/sec
- PeakMemoryUsage: 4.00 KB (4096)
- SerializeBatchTime: 16.673us
- ThriftTransmitTime(*): 241.624us
- UncompressedRowBatchSize: 76.00 B (76)
SORT_NODE (id=2):(Total: 3s604ms, non-child: 313.224us, % non-child: 0.01%)
- InMemorySortTime: 2.746us
- InitialRunsCreated: 1 (1)
- PeakMemoryUsage: 8.02 MB (8404992)
- RowsReturned: 2 (2)
- RowsReturnedRate: 0
- SortDataSize: 64.00 B (64)
- TotalMergesPerformed: 0 (0)
AGGREGATION_NODE (id=4):(Total: 3s604ms, non-child: 272.347ms, % non-child: 7.56%)
- BuildTime: 28.572us
- GetNewBlockTime: 86.550us
- GetResultsTime: 5.612us
- HashBuckets: 2.05K (2048)
- LargestPartitionPercent: 12 (12)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 3.15 MB (3301376)
- PinTime: 0ns
- RowsRepartitioned: 0 (0)
- RowsReturned: 2 (2)
- RowsReturnedRate: 0
- SpilledPartitions: 0 (0)
- UnpinTime: 827ns
EXCHANGE_NODE (id=3):(Total: 3s332ms, non-child: 69.672us, % non-child: 0.00%)
BytesReceived(500.0ms): 0, 0, 0, 0, 52.00 B, 52.00 B
- BytesReceived: 208.00 B (208)
- ConvertRowBatchTime: 8.324us
- DeserializeRowBatchTimer: 44.682us
- FirstBatchArrivalWaitTime: 2s117ms
- PeakMemoryUsage: 0
- RowsReturned: 8 (8)
- RowsReturnedRate: 2.00 /sec
- SendersBlockedTimer: 0ns
- SendersBlockedTotalTimer(*): 0ns
Instance a7442ed1a43b8e8d:8ad38fb17f29baa3 (host=worker2.some.domain.net:22000):(Total: 3s604ms, non-child: 0ns, % non-child: 0.00%)
MemoryUsage(500.0ms): 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.15 MB, 3.16 MB
ThreadUsage(500.0ms): 1, 1, 1, 1, 1, 1, 1
- AverageThreadTokens: 1.00
- PeakMemoryUsage: 8.21 MB (8613888)
- PerHostPeakMemUsage: 8.23 MB (8626176)
- PrepareTime: 238.304ms
- RowsProduced: 1 (1)
- TotalCpuTime: 240.469ms
- TotalNetworkReceiveTime: 3s364ms
- TotalNetworkSendTime: 332.998us
- TotalStorageWaitTime: 0ns
CodeGen:(Total: 237.499ms, non-child: 237.499ms, % non-child: 100.00%)
- CodegenTime: 621.766us
- CompileTime: 524ns
- LoadTime: 61.381us
- ModuleFileSize: 2.23 MB (2333204)
- OptimizationTime: 0ns
- PrepareTime: 234.905ms
DataStreamSender (dst_id=5):(Total: 62.686us, non-child: 62.686us, % non-child: 100.00%)
- BytesSent: 31.00 B (31)
- NetworkThroughput(*): 121.00 KB/sec
- OverallThroughput: 482.94 KB/sec
- PeakMemoryUsage: 4.00 KB (4096)
- SerializeBatchTime: 25.91us
- ThriftTransmitTime(*): 250.198us
- UncompressedRowBatchSize: 40.00 B (40)
SORT_NODE (id=2):(Total: 3s604ms, non-child: 588.641us, % non-child: 0.02%)
- InMemorySortTime: 1.479us
- InitialRunsCreated: 1 (1)
- PeakMemoryUsage: 8.02 MB (8404992)
- RowsReturned: 1 (1)
- RowsReturnedRate: 0
- SortDataSize: 32.00 B (32)
- TotalMergesPerformed: 0 (0)
AGGREGATION_NODE (id=4):(Total: 3s603ms, non-child: 238.727ms, % non-child: 6.62%)
- BuildTime: 30.821us
- GetNewBlockTime: 57.183us
- GetResultsTime: 3.80us
- HashBuckets: 1.02K (1024)
- LargestPartitionPercent: 25 (25)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 3.15 MB (3301376)
- PinTime: 0ns
- RowsRepartitioned: 0 (0)
- RowsReturned: 1 (1)
- RowsReturnedRate: 0
- SpilledPartitions: 0 (0)
- UnpinTime: 1.96us
EXCHANGE_NODE (id=3):(Total: 3s364ms, non-child: 87.60us, % non-child: 0.00%)
BytesReceived(500.0ms): 0, 0, 0, 0, 0, 30.00 B, 90.00 B
- BytesReceived: 120.00 B (120)
- ConvertRowBatchTime: 11.349us
- DeserializeRowBatchTimer: 44.852us
- FirstBatchArrivalWaitTime: 2s150ms
- PeakMemoryUsage: 0
- RowsReturned: 4 (4)
- RowsReturnedRate: 1.00 /sec
- SendersBlockedTimer: 0ns
- SendersBlockedTotalTimer(*): 0ns
Instance a7442ed1a43b8e8d:8ad38fb17f29baa4 (host=worker1.some.domain.net:22000):(Total: 3s604ms, non-child: 0ns, % non-child: 0.00%)
MemoryUsage(500.0ms): 0, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB
ThreadUsage(500.0ms): 1, 1, 1, 1, 1, 1, 1, 1
- AverageThreadTokens: 1.00
- PeakMemoryUsage: 8.02 MB (8409088)
- PerHostPeakMemUsage: 8.10 MB (8494080)
- PrepareTime: 240.672ms
- RowsProduced: 0 (0)
- TotalCpuTime: 242.522ms
- TotalNetworkReceiveTime: 3s362ms
- TotalNetworkSendTime: 230ns
- TotalStorageWaitTime: 0ns
BlockMgr:
- BlockWritesOutstanding: 0 (0)
- BlocksCreated: 96 (96)
- BlocksRecycled: 1 (1)
- BufferedPins: 0 (0)
- BytesWritten: 0
- MaxBlockSize: 8.00 MB (8388608)
- MemoryLimit: 38.40 GB (41231687680)
- PeakMemoryUsage: 8.00 MB (8388608)
- TotalBufferWaitTime: 0ns
- TotalEncryptionTime: 0ns
- TotalIntegrityCheckTime: 0ns
- TotalReadBlockTime: 0ns
CodeGen:(Total: 240.75ms, non-child: 240.75ms, % non-child: 100.00%)
- CodegenTime: 430.880us
- CompileTime: 253ns
- LoadTime: 153.349us
- ModuleFileSize: 2.23 MB (2333204)
- OptimizationTime: 0ns
- PrepareTime: 238.587ms
DataStreamSender (dst_id=5):(Total: 14.671us, non-child: 14.671us, % non-child: 100.00%)
- BytesSent: 0
- NetworkThroughput(*): 0.00 /sec
- OverallThroughput: 0.00 /sec
- PeakMemoryUsage: 4.00 KB (4096)
- SerializeBatchTime: 0ns
- ThriftTransmitTime(*): 0ns
- UncompressedRowBatchSize: 0
SORT_NODE (id=2):(Total: 3s603ms, non-child: 364.504us, % non-child: 0.01%)
- InMemorySortTime: 1.118us
- InitialRunsCreated: 1 (1)
- PeakMemoryUsage: 8.01 MB (8396800)
- RowsReturned: 0 (0)
- RowsReturnedRate: 0
- SortDataSize: 0
- TotalMergesPerformed: 0 (0)
AGGREGATION_NODE (id=4):(Total: 3s603ms, non-child: 240.841ms, % non-child: 6.68%)
- BuildTime: 1.836us
- GetNewBlockTime: 39.494us
- GetResultsTime: 0ns
- HashBuckets: 0 (0)
- LargestPartitionPercent: 0 (0)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 3.13 MB (3284992)
- PinTime: 0ns
- RowsRepartitioned: 0 (0)
- RowsReturned: 0 (0)
- RowsReturnedRate: 0
- SpilledPartitions: 0 (0)
- UnpinTime: 686ns
EXCHANGE_NODE (id=3):(Total: 3s362ms, non-child: 5.944us, % non-child: 0.00%)
BytesReceived(500.0ms): 0, 0, 0, 0, 0, 0, 0
- BytesReceived: 0
- ConvertRowBatchTime: 817ns
- DeserializeRowBatchTimer: 0ns
- FirstBatchArrivalWaitTime: 3s362ms
- PeakMemoryUsage: 0
- RowsReturned: 0 (0)
- RowsReturnedRate: 0
- SendersBlockedTimer: 0ns
- SendersBlockedTotalTimer(*): 0ns
Instance a7442ed1a43b8e8d:8ad38fb17f29baa2 (host=worker3.some.domain.net:22000):(Total: 3s603ms, non-child: 0ns, % non-child: 0.00%)
MemoryUsage(500.0ms): 0, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.15 MB, 3.15 MB, 3.16 MB
ThreadUsage(500.0ms): 1, 1, 1, 1, 1, 1, 1, 1
- AverageThreadTokens: 1.00
- PeakMemoryUsage: 8.41 MB (8818688)
- PerHostPeakMemUsage: 8.41 MB (8818688)
- PrepareTime: 270.423ms
- RowsProduced: 2 (2)
- TotalCpuTime: 272.476ms
- TotalNetworkReceiveTime: 3s331ms
- TotalNetworkSendTime: 565.845us
- TotalStorageWaitTime: 0ns
BlockMgr:
- BlockWritesOutstanding: 0 (0)
- BlocksCreated: 96 (96)
- BlocksRecycled: 1 (1)
- BufferedPins: 0 (0)
- BytesWritten: 0
- MaxBlockSize: 8.00 MB (8388608)
- MemoryLimit: 38.40 GB (41231687680)
- PeakMemoryUsage: 8.02 MB (8404992)
- TotalBufferWaitTime: 0ns
- TotalEncryptionTime: 0ns
- TotalIntegrityCheckTime: 0ns
- TotalReadBlockTime: 0ns
CodeGen:(Total: 269.840ms, non-child: 269.840ms, % non-child: 100.00%)
- CodegenTime: 423.130us
- CompileTime: 250ns
- LoadTime: 121.419us
- ModuleFileSize: 2.23 MB (2333204)
- OptimizationTime: 0ns
- PrepareTime: 268.433ms
DataStreamSender (dst_id=5):(Total: 43.283us, non-child: 43.283us, % non-child: 100.00%)
- BytesSent: 50.00 B (50)
- NetworkThroughput(*): 110.36 KB/sec
- OverallThroughput: 1.10 MB/sec
- PeakMemoryUsage: 4.00 KB (4096)
- SerializeBatchTime: 17.90us
- ThriftTransmitTime(*): 442.430us
- UncompressedRowBatchSize: 76.00 B (76)
SORT_NODE (id=2):(Total: 3s602ms, non-child: 473.974us, % non-child: 0.01%)
- InMemorySortTime: 3.235us
- InitialRunsCreated: 1 (1)
- PeakMemoryUsage: 8.02 MB (8404992)
- RowsReturned: 2 (2)
- RowsReturnedRate: 0
- SortDataSize: 64.00 B (64)
- TotalMergesPerformed: 0 (0)
AGGREGATION_NODE (id=4):(Total: 3s602ms, non-child: 270.772ms, % non-child: 7.52%)
- BuildTime: 24.478us
- GetNewBlockTime: 38.840us
- GetResultsTime: 3.61us
- HashBuckets: 2.05K (2048)
- LargestPartitionPercent: 12 (12)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 3.15 MB (3301376)
- PinTime: 0ns
- RowsRepartitioned: 0 (0)
- RowsReturned: 2 (2)
- RowsReturnedRate: 0
- SpilledPartitions: 0 (0)
- UnpinTime: 874ns
EXCHANGE_NODE (id=3):(Total: 3s331ms, non-child: 85.894us, % non-child: 0.00%)
BytesReceived(500.0ms): 0, 0, 0, 0, 52.00 B, 52.00 B, 156.00 B
- BytesReceived: 208.00 B (208)
- ConvertRowBatchTime: 11.584us
- DeserializeRowBatchTimer: 34.881us
- FirstBatchArrivalWaitTime: 2s117ms
- PeakMemoryUsage: 0
- RowsReturned: 8 (8)
- RowsReturnedRate: 2.00 /sec
- SendersBlockedTimer: 0ns
- SendersBlockedTotalTimer(*): 0ns
Fragment F00:
Instance a7442ed1a43b8e8d:8ad38fb17f29baa8 (host=worker1.some.domain.net:22000):(Total: 3s327ms, non-child: 0ns, % non-child: 0.00%)
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:8/1.92 GB
MemoryUsage(500.0ms): 4.73 MB, 4.76 MB, 4.77 MB, 4.76 MB, 4.76 MB, 4.73 MB, 4.23 MBThe new cluster:
+---------------------+----------+
| part_col | count(*) |
+---------------------+----------+
| 2017-01-01 00:00:00 | 18428899 |
| 2017-01-02 00:00:00 | 20974112 |
| 2017-01-03 00:00:00 | 13178213 |
| 2017-01-04 00:00:00 | 26768773 |
| 2017-01-05 00:00:00 | 24148741 |
+---------------------+----------+
Query Runtime Profile:
Query (id=814dc21ba8355459:ac7e36724f454893):
Summary:
Session ID: 694ced478c51b5e5:321e569a7a0513b5
Session Type: BEESWAX
Start Time: 2017-01-31 09:08:07.217250000
End Time: 2017-01-31 09:08:12.161790000
Query Type: QUERY
Query State: FINISHED
Query Status: OK
Impala Version: impalad version 2.6.0-cdh5.8.3 RELEASE (build c644f476b774db9db87a619628f7a6ecc5f843e0)
User: test_user@SOME.DOMAIN.NET
Connected User: test_user@SOME.DOMAIN.NET
Delegated User:
Network Address: ::ffff:xx.xx.xx.xx:55766
Default Db: default
Sql Statement: select part_col, count(*) from perf_test group by part_col order by 1
Coordinator: worker3.some.domain.net:22000
Query Options (non default):
Plan:
----------------
Estimated Per-Host Requirements: Memory=344.00MB VCores=2
WARNING: The following tables are missing relevant table and/or column statistics.
default.perf_test
05:MERGING-EXCHANGE [UNPARTITIONED]
| order by: part_col ASC
| hosts=5 per-host-mem=unavailable
| tuple-ids=2 row-size=24B cardinality=unavailable
|
02:SORT
| order by: part_col ASC
| hosts=5 per-host-mem=0B
| tuple-ids=2 row-size=24B cardinality=unavailable
|
04:AGGREGATE [FINALIZE]
| output: count:merge(*)
| group by: part_col
| hosts=5 per-host-mem=128.00MB
| tuple-ids=1 row-size=24B cardinality=unavailable
|
03:EXCHANGE [HASH(part_col)]
| hosts=5 per-host-mem=0B
| tuple-ids=1 row-size=24B cardinality=unavailable
|
01:AGGREGATE [STREAMING]
| output: count(*)
| group by: part_col
| hosts=5 per-host-mem=128.00MB
| tuple-ids=1 row-size=24B cardinality=unavailable
|
00:SCAN HDFS [default.perf_test, RANDOM]
partitions=1/1 files=32 size=7.22GB
table stats: unavailable
column stats: unavailable
hosts=5 per-host-mem=88.00MB
tuple-ids=0 row-size=16B cardinality=unavailable
----------------
Estimated Per-Host Mem: 360710144
Estimated Per-Host VCores: 2
Tables Missing Stats: default.perf_test
Request Pool: root.test_user
Admission result: Admitted immediately
ExecSummary:
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
---------------------------------------------------------------------------------------------------------------------
05:MERGING-EXCHANGE 1 82.553us 82.553us 5 -1 0 -1.00 B UNPARTITIONED
02:SORT 5 434.217us 491.025us 5 -1 8.02 MB 0
04:AGGREGATE 5 279.461ms 316.351ms 5 -1 2.30 MB 128.00 MB FINALIZE
03:EXCHANGE 5 28.378us 55.658us 25 -1 0 0 HASH(part_col)
01:AGGREGATE 5 3s965ms 4s739ms 25 -1 1.59 MB 128.00 MB STREAMING
00:SCAN HDFS 5 83.393ms 118.345ms 103.50M -1 12.04 MB 88.00 MB default.perf_test
Planner Timeline: 23.087ms
- Analysis finished: 10.147ms (10.147ms)
- Equivalence classes computed: 10.654ms (506.930us)
- Single node plan created: 14.745ms (4.091ms)
- Runtime filters computed: 14.787ms (41.721us)
- Distributed plan created: 17.268ms (2.481ms)
- Lineage info computed: 18.728ms (1.460ms)
- Planning finished: 23.087ms (4.358ms)
Query Timeline: 4s949ms
- Start execution: 56.064us (56.064us)
- Planning finished: 26.811ms (26.755ms)
- Submit for admission: 27.851ms (1.039ms)
- Completed admission: 27.929ms (77.878us)
- Ready to start 10 remote fragments: 28.364ms (434.946us)
- All 10 remote fragments started: 29.638ms (1.273ms)
- Rows available: 4s860ms (4s831ms)
- First row fetched: 4s942ms (81.424ms)
- Unregister query: 4s944ms (2.126ms)
ImpalaServer:
- ClientFetchWaitTimer: 82.476ms
- RowMaterializationTimer: 51.064us
Execution Profile 814dc21ba8355459:ac7e36724f454893:(Total: 4s833ms, non-child: 0.000ns, % non-child: 0.00%)
Number of filters: 0
Filter routing table:
ID Src. Node Tgt. Node(s) Targets Target type Partition filter Pending (Expected) First arrived Completed
-------------------------------------------------------------------------------------------------------------------
Fragment start latencies: Count: 10, 25th %-ile: 1ms, 50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th %-ile: 1ms
Per Node Peak Memory Usage: worker2.some.domain.net:22000(23.93 MB) worker5.some.domain.net:22000(23.32 MB) worker1.some.domain.net:22000(23.85 MB) worker4.some.domain.net:22000(23.94 MB) worker3.some.domain.net:22000(23.86 MB)
- FiltersReceived: 0 (0)
- FinalizationTimer: 0.000ns
Coordinator Fragment F02:(Total: 4s831ms, non-child: 255.854us, % non-child: 0.01%)
MemoryUsage(500.000ms): 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB
- AverageThreadTokens: 0.00
- BloomFilterBytes: 0
- PeakMemoryUsage: 20.04 KB (20520)
- PerHostPeakMemUsage: 0
- PrepareTime: 45.494us
- RowsProduced: 5 (5)
- TotalCpuTime: 4s914ms
- TotalNetworkReceiveTime: 0.000ns
- TotalNetworkSendTime: 0.000ns
- TotalStorageWaitTime: 0.000ns
BlockMgr:
- BlockWritesOutstanding: 0 (0)
- BlocksCreated: 49 (49)
- BlocksRecycled: 0 (0)
- BufferedPins: 0 (0)
- BytesWritten: 0
- MaxBlockSize: 8.00 MB (8388608)
- MemoryLimit: 61.98 GB (66552700928)
- PeakMemoryUsage: 8.77 MB (9191424)
- TotalBufferWaitTime: 0.000ns
- TotalEncryptionTime: 0.000ns
- TotalIntegrityCheckTime: 0.000ns
- TotalReadBlockTime: 0.000ns
EXCHANGE_NODE (id=5):(Total: 4s831ms, non-child: 82.553us, % non-child: 0.00%)
BytesReceived(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0
- BytesReceived: 134.00 B (134)
- ConvertRowBatchTime: 0.000ns
- DeserializeRowBatchTimer: 31.107us
- FirstBatchArrivalWaitTime: 4s830ms
- MergeGetNext: 25.594us
- MergeGetNextBatch: 15.618us
- PeakMemoryUsage: 0
- RowsReturned: 5 (5)
- RowsReturnedRate: 1.00 /sec
- SendersBlockedTimer: 0.000ns
- SendersBlockedTotalTimer(*): 0.000ns
Averaged Fragment F01:(Total: 4s721ms, non-child: 0.000ns, % non-child: 0.00%)
split sizes: min: 0, max: 0, avg: 0, stddev: 0
completion times: min:4s831ms max:4s832ms mean: 4s832ms stddev:302.637us
execution rates: min:0.00 /sec max:0.00 /sec mean:0.00 /sec stddev:0.00 /sec
num instances: 5
- AverageThreadTokens: 1.00
- BloomFilterBytes: 0
- PeakMemoryUsage: 10.30 MB (10802320)
- PerHostPeakMemUsage: 23.78 MB (24934251)
- PrepareTime: 278.764ms
- RowsProduced: 1 (1)
- TotalCpuTime: 390.687ms
- TotalNetworkReceiveTime: 4s440ms
- TotalNetworkSendTime: 107.306us
- TotalStorageWaitTime: 0.000ns
BlockMgr:
- BlockWritesOutstanding: 0 (0)
- BlocksCreated: 49 (49)
- BlocksRecycled: 0 (0)
- BufferedPins: 0 (0)
- BytesWritten: 0
- MaxBlockSize: 8.00 MB (8388608)
- MemoryLimit: 61.98 GB (66552700928)
- PeakMemoryUsage: 8.77 MB (9191424)
- TotalBufferWaitTime: 0.000ns
- TotalEncryptionTime: 0.000ns
- TotalIntegrityCheckTime: 0.000ns
- TotalReadBlockTime: 0.000ns
CodeGen:(Total: 381.581ms, non-child: 381.581ms, % non-child: 100.00%)
- CodegenTime: 6.404ms
- CompileTime: 7.476ms
- LoadTime: 0.000ns
- ModuleBitcodeSize: 2.15 MB (2254856)
- NumFunctions: 29 (29)
- NumInstructions: 550 (550)
- OptimizationTime: 102.403ms
- PrepareTime: 270.719ms
DataStreamSender (dst_id=5):(Total: 68.002us, non-child: 68.002us, % non-child: 100.00%)
- BytesSent: 26.00 B (26)
- NetworkThroughput(*): 787.80 KB/sec
- OverallThroughput: 295.48 KB/sec
- PeakMemoryUsage: 4.00 KB (4096)
- RowsReturned: 1 (1)
- SerializeBatchTime: 19.405us
- TransmitDataRPCTime: 21.326us
- UncompressedRowBatchSize: 38.00 B (38)
SORT_NODE (id=2):(Total: 4s720ms, non-child: 434.218us, % non-child: 0.01%)
- InMemorySortTime: 1.484us
- InitialRunsCreated: 1 (1)
- PeakMemoryUsage: 8.01 MB (8401715)
- RowsReturned: 1 (1)
- RowsReturnedRate: 0
- SortDataSize: 32.00 B (32)
- SpilledRuns: 0 (0)
- TotalMergesPerformed: 0 (0)
AGGREGATION_NODE (id=4):(Total: 4s720ms, non-child: 279.461ms, % non-child: 5.92%)
- BuildTime: 13.858us
- GetNewBlockTime: 145.320us
- GetResultsTime: 1.887us
- HTResizeTime: 3.177us
- HashBuckets: 1.02K (1024)
- LargestPartitionPercent: 8 (8)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 2.29 MB (2401408)
- PinTime: 0.000ns
- RowsRepartitioned: 0 (0)
- RowsReturned: 1 (1)
- RowsReturnedRate: 0
- SpilledPartitions: 0 (0)
- UnpinTime: 1.046us
EXCHANGE_NODE (id=3):(Total: 4s440ms, non-child: 4s440ms, % non-child: 100.00%)
- BytesReceived: 133.00 B (133)
- ConvertRowBatchTime: 4.099us
- DeserializeRowBatchTimer: 27.477us
- FirstBatchArrivalWaitTime: 2s625ms
- PeakMemoryUsage: 0
- RowsReturned: 5 (5)
- RowsReturnedRate: 1.00 /sec
- SendersBlockedTimer: 0.000ns
- SendersBlockedTotalTimer(*): 0.000ns
Averaged Fragment F00:(Total: 4s051ms, non-child: 0.000ns, % non-child: 0.00%)
split sizes: min: 1.33 GB, max: 1.54 GB, avg: 1.44 GB, stddev: 78.16 MB
completion times: min:1s806ms max:4s831ms mean: 4s052ms stddev:1s134ms
execution rates: min:317.74 MB/sec max:756.58 MB/sec mean:412.43 MB/sec stddev:172.21 MB/sec
num instances: 5
- AverageThreadTokens: 6.25
- BloomFilterBytes: 0
- PeakMemoryUsage: 13.49 MB (14146688)
- PerHostPeakMemUsage: 23.78 MB (24934251)
- PrepareTime: 281.243ms
- RowsProduced: 5 (5)
- TotalCpuTime: 26s501ms
- TotalNetworkReceiveTime: 0.000ns
- TotalNetworkSendTime: 506.296us
- TotalStorageWaitTime: 136.659ms
CodeGen:(Total: 274.654ms, non-child: 274.654ms, % non-child: 100.00%)
- CodegenTime: 6.016ms
- CompileTime: 0.000ns
- LoadTime: 0.000ns
- ModuleBitcodeSize: 2.15 MB (2254856)
- NumFunctions: 0 (0)
- NumInstructions: 0 (0)
- OptimizationTime: 0.000ns
- PrepareTime: 273.688ms
DataStreamSender (dst_id=3):(Total: 147.360us, non-child: 147.360us, % non-child: 100.00%)
- BytesSent: 133.00 B (133)
- NetworkThroughput(*): 1.84 MB/sec
- OverallThroughput: 885.74 KB/sec
- PeakMemoryUsage: 32.00 KB (32768)
- RowsReturned: 5 (5)
- SerializeBatchTime: 48.621us
- TransmitDataRPCTime: 73.238us
- UncompressedRowBatchSize: 192.00 B (192)
AGGREGATION_NODE (id=1):(Total: 4s049ms, non-child: 3s965ms, % non-child: 97.94%)
- GetNewBlockTime: 78.103us
- GetResultsTime: 9.538us
- HTResizeTime: 523.269us
- HashBuckets: 8.19K (8192)
- LargestPartitionPercent: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 1.59 MB (1668224)
- PinTime: 0.000ns
- ReductionFactorEstimate: 0.00
- ReductionFactorThresholdToExpand: 0.00
- RowsPassedThrough: 0 (0)
- RowsReturned: 5 (5)
- RowsReturnedRate: 1.00 /sec
- StreamingTime: 3s666ms
- UnpinTime: 0.000ns
HDFS_SCAN_NODE (id=0):(Total: 83.393ms, non-child: 83.393ms, % non-child: 100.00%)
- AverageHdfsReadThreadConcurrency: 0.00
- AverageScannerThreadConcurrency: 5.63
- BytesRead: 650.78 KB (666403)
- BytesReadDataNodeCache: 0
- BytesReadLocal: 650.78 KB (666403)
- BytesReadRemoteUnexpected: 0
- BytesReadShortCircuit: 650.78 KB (666403)
- DecompressionTime: 345.570us
- MaxCompressedTextFileLength: 0
- NumColumns: 1 (1)
- NumDisksAccessed: 6 (6)
- NumRowGroups: 6 (6)
- NumScannerThreadsStarted: 6 (6)
- PeakMemoryUsage: 11.89 MB (12465356)
- PerReadThreadRawHdfsThroughput: 686.45 MB/sec
- RemoteScanRanges: 0 (0)
- RowsRead: 20.70M (20699747)
- RowsReturned: 20.70M (20699747)
- RowsReturnedRate: 282.05 M/sec
- ScanRangesComplete: 6 (6)
- ScannerThreadsInvoluntaryContextSwitches: 12 (12)
- ScannerThreadsTotalWallClockTime: 22s586ms
- MaterializeTupleTime(*): 22s440ms
- ScannerThreadsSysTime: 32.800ms
- ScannerThreadsUserTime: 666.440ms
- ScannerThreadsVoluntaryContextSwitches: 20.07K (20074)
- TotalRawHdfsReadTime(*): 1.368ms
- TotalReadThroughput: 181.83 KB/sec
Fragment F01:
Instance 814dc21ba8355459:ac7e36724f454896 (host=worker1.some.domain.net:22000):(Total: 4s773ms, non-child: 0.000ns, % non-child: 0.00%)
MemoryUsage(500.000ms): 10.28 MB, 10.28 MB, 10.28 MB, 10.28 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.31 MB
ThreadUsage(500.000ms): 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
- AverageThreadTokens: 1.00
- BloomFilterBytes: 0
- PeakMemoryUsage: 10.31 MB (10810528)
- PerHostPeakMemUsage: 23.85 MB (25013504)
- PrepareTime: 140.565ms
- RowsProduced: 2 (2)
- TotalCpuTime: 198.707ms
- TotalNetworkReceiveTime: 4s632ms
- TotalNetworkSendTime: 144.042us
- TotalStorageWaitTime: 0.000ns
BlockMgr:
- BlockWritesOutstanding: 0 (0)
- BlocksCreated: 49 (49)
- BlocksRecycled: 0 (0)
- BufferedPins: 0 (0)
- BytesWritten: 0
- MaxBlockSize: 8.00 MB (8388608)
- MemoryLimit: 61.98 GB (66552700928)
- PeakMemoryUsage: 8.77 MB (9191424)
- TotalBufferWaitTime: 0.000ns
- TotalEncryptionTime: 0.000ns
- TotalIntegrityCheckTime: 0.000ns
- TotalReadBlockTime: 0.000ns
CodeGen:(Total: 193.769ms, non-child: 193.769ms, % non-child: 100.00%)
- CodegenTime: 3.318ms
- CompileTime: 3.802ms
- LoadTime: 0.000ns
- ModuleBitcodeSize: 2.15 MB (2254856)
- NumFunctions: 29 (29)
- NumInstructions: 550 (550)
- OptimizationTime: 53.125ms
- PrepareTime: 136.382ms
DataStreamSender (dst_id=5):(Total: 119.588us, non-child: 119.588us, % non-child: 100.00%)
- BytesSent: 50.00 B (50)
- NetworkThroughput(*): 1.84 MB/sec
- OverallThroughput: 408.30 KB/sec
- PeakMemoryUsage: 4.00 KB (4096)
- RowsReturned: 2 (2)
- SerializeBatchTime: 73.063us
- TransmitDataRPCTime: 25.979us
- UncompressedRowBatchSize: 76.00 B (76)
SORT_NODE (id=2):(Total: 4s773ms, non-child: 271.839us, % non-child: 0.01%)
ExecOption: Codegen Enabled
- InMemorySortTime: 1.324us
- InitialRunsCreated: 1 (1)
- PeakMemoryUsage: 8.02 MB (8404992)
- RowsReturned: 2 (2)
- RowsReturnedRate: 0
- SortDataSize: 64.00 B (64)
- SpilledRuns: 0 (0)
- TotalMergesPerformed: 0 (0)
AGGREGATION_NODE (id=4):(Total: 4s773ms, non-child: 140.842ms, % non-child: 2.95%)
ExecOption: Codegen Disabled: HashTableCtx::CodegenEvalRow(): type TIMESTAMP NYI
- BuildTime: 12.344us
- GetNewBlockTime: 87.985us
- GetResultsTime: 1.797us
- HTResizeTime: 1.646us
- HashBuckets: 2.05K (2048)
- LargestPartitionPercent: 10 (10)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 2.30 MB (2409600)
- PinTime: 0.000ns
- RowsRepartitioned: 0 (0)
- RowsReturned: 2 (2)
- RowsReturnedRate: 0
- SpilledPartitions: 0 (0)
- UnpinTime: 616.000ns
EXCHANGE_NODE (id=3):(Total: 4s632ms, non-child: 28.353us, % non-child: 0.00%)
BytesReceived(500.000ms): 0, 0, 0, 0, 52.00 B, 52.00 B, 52.00 B, 52.00 B, 52.00 B, 208.00 B
- BytesReceived: 260.00 B (260)
- ConvertRowBatchTime: 4.176us
- DeserializeRowBatchTimer: 25.676us
- FirstBatchArrivalWaitTime: 1s607ms
- PeakMemoryUsage: 0
- RowsReturned: 10 (10)
- RowsReturnedRate: 2.00 /sec
- SendersBlockedTimer: 0.000ns
- SendersBlockedTotalTimer(*): 0.000ns
Instance 814dc21ba8355459:ac7e36724f454897 (host=worker5.some.domain.net:22000):(Total: 4s709ms, non-child: 0.000ns, % non-child: 0.00%)
MemoryUsage(500.000ms): 0, 10.28 MB, 10.28 MB, 10.28 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.30 MB
ThreadUsage(500.000ms): 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
- AverageThreadTokens: 1.00
- BloomFilterBytes: 0
- PeakMemoryUsage: 10.31 MB (10810528)
- PerHostPeakMemUsage: 23.32 MB (24449296)
- PrepareTime: 311.018ms
- RowsProduced: 2 (2)
- TotalCpuTime: 435.142ms
- TotalNetworkReceiveTime: 4s396ms
- TotalNetworkSendTime: 187.827us
- TotalStorageWaitTime: 0.000ns
BlockMgr:
- BlockWritesOutstanding: 0 (0)
- BlocksCreated: 49 (49)
- BlocksRecycled: 0 (0)
- BufferedPins: 0 (0)
- BytesWritten: 0
- MaxBlockSize: 8.00 MB (8388608)
- MemoryLimit: 61.98 GB (66552700928)
- PeakMemoryUsage: 8.77 MB (9191424)
- TotalBufferWaitTime: 0.000ns
- TotalEncryptionTime: 0.000ns
- TotalIntegrityCheckTime: 0.000ns
- TotalReadBlockTime: 0.000ns
CodeGen:(Total: 425.412ms, non-child: 425.412ms, % non-child: 100.00%)
- CodegenTime: 6.872ms
- CompileTime: 8.123ms
- LoadTime: 0.000ns
- ModuleBitcodeSize: 2.15 MB (2254856)
- NumFunctions: 29 (29)
- NumInstructions: 550 (550)
- OptimizationTime: 113.870ms
- PrepareTime: 302.324ms
DataStreamSender (dst_id=5):(Total: 69.893us, non-child: 69.893us, % non-child: 100.00%)
- BytesSent: 53.00 B (53)
- NetworkThroughput(*): 1.31 MB/sec
- OverallThroughput: 740.53 KB/sec
- PeakMemoryUsage: 4.00 KB (4096)
- RowsReturned: 2 (2)
- SerializeBatchTime: 11.380us
- TransmitDataRPCTime: 38.673us
- UncompressedRowBatchSize: 76.00 B (76)
SORT_NODE (id=2):(Total: 4s708ms, non-child: 491.025us, % non-child: 0.01%)
ExecOption: Codegen Enabled
- InMemorySortTime: 2.603us
- InitialRunsCreated: 1 (1)
- PeakMemoryUsage: 8.02 MB (8404992)
- RowsReturned: 2 (2)
- RowsReturnedRate: 0
- SortDataSize: 64.00 B (64)
- SpilledRuns: 0 (0)
- TotalMergesPerformed: 0 (0)
AGGREGATION_NODE (id=4):(Total: 4s708ms, non-child: 311.656ms, % non-child: 6.62%)
ExecOption: Codegen Disabled: HashTableCtx::CodegenEvalRow(): type TIMESTAMP NYI
- BuildTime: 21.607us
- GetNewBlockTime: 144.739us
- GetResultsTime: 5.404us
- HTResizeTime: 3.560us
- HashBuckets: 2.05K (2048)
- LargestPartitionPercent: 10 (10)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 2.30 MB (2409600)
- PinTime: 0.000ns
- RowsRepartitioned: 0 (0)
- RowsReturned: 2 (2)
- RowsReturnedRate: 0
- SpilledPartitions: 0 (0)
- UnpinTime: 1.295us
EXCHANGE_NODE (id=3):(Total: 4s396ms, non-child: 42.905us, % non-child: 0.00%)
BytesReceived(500.000ms): 0, 0, 0, 51.00 B, 51.00 B, 51.00 B, 51.00 B, 51.00 B, 154.00 B
- BytesReceived: 258.00 B (258)
- ConvertRowBatchTime: 6.460us
- DeserializeRowBatchTimer: 67.686us
- FirstBatchArrivalWaitTime: 1s371ms
- PeakMemoryUsage: 0
- RowsReturned: 10 (10)
- RowsReturnedRate: 2.00 /sec
- SendersBlockedTimer: 0.000ns
- SendersBlockedTotalTimer(*): 0.000ns
Created 02-03-2017 03:04 PM
It's not entirely obvious to me what changed. Since it's grouping by timestamp, you're running into https://issues.cloudera.org/browse/IMPALA-3884, which is forcing it to execute the aggregation in an interpreted mode. I suspect that the performance of that interpreted code path may have changed a bit - we tend to focus most of our tuning on the codegened code path.
On releases with IMPALA-3884 fixed it should be a *lot* faster.
" ExecOption: Codegen Disabled: HashTableCtx::CodegenEvalRow(): type TIMESTAMP NYI"
Created 02-08-2017 05:26 AM
It looks like the problem is really in the timestamp field. Running a similar query on table without timestamp show much better results on the new environment. Thanks for the help