Created on 02-02-2017 05:21 AM - edited 09-16-2022 04:00 AM
Hi,
I have an issue after upgrade of cluster and migrate to a newer hardware. The same query "Select part_col, count(*) from table group by part_col" runs significantly slower on the new environment. In this case there are two clusters, one 4 nodes (CDH 5.4 Impala 2.2), one with 5 nodes (CDH5.8 Impala 2.6 kerberized). The worker nodes are newer and has more data disks, the processor is the same Xeon E5-2640 v3 vs E5-2650 v2.
The connectivity is 1GBit (old) vs 10Gbit(new).
So the old cluster has less nodes, less disks per node (5 vs 16), less RAM, but still manages to execute the query faster than the new cluster. I digged into the profile of the queries and found out that the step AGGREGATE takes much longer and consumes all the exec time. And in the new version it has a tag "STREAMING". The HDFS scan is in this case more or less the same, but on larger table much faster on the new cluster. So I assume this is not an IO issue.
I assume that during AGGREGATE the data is already in RAM and hashes are computed by CPU..
Does it mean that the new cluster has less powerfull CPU?
Or what other factors can affect the AGGREGATE (STREAMING) duration?
Is there any other tools or papers describing how to tune Impala in these situations?
Thanks for any advice
OLD CLUSTER - Impala 2.2 Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail --------------------------------------------------------------------------------------------------------------------- 05:MERGING-EXCHANGE 1 119.314us 119.314us 5 -1 0 -1.00 B UNPARTITIONED 02:SORT 4 435.85us 588.641us 5 -1 8.02 MB 0 04:AGGREGATE 4 255.672ms 272.347ms 5 -1 3.15 MB 128.00 MB FINALIZE 03:EXCHANGE 4 62.142us 87.60us 20 -1 0 0 HASH(part_col) 01:AGGREGATE 4 2s803ms 3s270ms 20 -1 3.17 MB 128.00 MB 00:SCAN HDFS 4 53.678ms 56.787ms 103.50M -1 1.82 MB 88.00 MB default.table NEW CLUSTER - Impala 2.6 Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail --------------------------------------------------------------------------------------------------------------------- 05:MERGING-EXCHANGE 1 82.553us 82.553us 5 -1 0 -1.00 B UNPARTITIONED 02:SORT 5 434.217us 491.025us 5 -1 8.02 MB 0 04:AGGREGATE 5 279.461ms 316.351ms 5 -1 2.30 MB 128.00 MB FINALIZE 03:EXCHANGE 5 28.378us 55.658us 25 -1 0 0 HASH(part_col) 01:AGGREGATE 5 3s965ms 4s739ms 25 -1 1.59 MB 128.00 MB STREAMING 00:SCAN HDFS 5 83.393ms 118.345ms 103.50M -1 12.04 MB 88.00 MB default.table
Created 02-03-2017 03:04 PM
It's not entirely obvious to me what changed. Since it's grouping by timestamp, you're running into https://issues.cloudera.org/browse/IMPALA-3884, which is forcing it to execute the aggregation in an interpreted mode. I suspect that the performance of that interpreted code path may have changed a bit - we tend to focus most of our tuning on the codegened code path.
On releases with IMPALA-3884 fixed it should be a *lot* faster.
" ExecOption: Codegen Disabled: HashTableCtx::CodegenEvalRow(): type TIMESTAMP NYI"
Created 02-02-2017 10:04 AM
Can you share the explain plan and the query? I'm interested to know what data types and aggregate functions you're using.
We're not aware of any performance regressions between those versions - we actually made a number of improvements, e.g. changing the hash table implementation and adding software prefetching. It's possible that your workload hit one of the edge cases where these changes had a detrimental effect.
Created 02-02-2017 10:08 AM
If you want to experiment you could try setting the query options "PREFETCH_MODE=0" and "DISABLE_STREAMING_PREAGGREGATIONS=1". You could also try adjusting NUM_SCANNER_THREADS.
With the last option, we changed our recommendation recently to set it to the # of logical cores on the machine (it was 3x logical cores originally - we changed the default in a later release). You can set the default via the "default query options" impalad startup option.
Created 02-03-2017 01:24 AM
I tried to disable prefetch and streaming, but did not helped.
Here is the profile of old Impala on old cluster:
+---------------------+----------+ | part_col | count(*) | +---------------------+----------+ | 2017-01-01 00:00:00 | 18428899 | | 2017-01-02 00:00:00 | 20974112 | | 2017-01-03 00:00:00 | 13178213 | | 2017-01-04 00:00:00 | 26768773 | | 2017-01-05 00:00:00 | 24148741 | +---------------------+----------+ Query Runtime Profile: Query (id=a7442ed1a43b8e8d:8ad38fb17f29baa0): Summary: Session ID: f94cc4d4c075f287:175ee4069895a6b0 Session Type: BEESWAX Start Time: 2017-01-31 09:07:47.514559000 End Time: 2017-01-31 09:07:51.159487000 Query Type: QUERY Query State: FINISHED Query Status: OK Impala Version: impalad version 2.2.0-cdh5.4.8 RELEASE (build 137d99e9f751c454a0c79f3b00302938e4984f9c) User: hadoop Connected User: hadoop Delegated User: Network Address: ::ffff:xx.xx.xx.xx:33929 Default Db: default Sql Statement: select part_col, count(*) from perf_test group by part_col order by 1 Coordinator: worker2.some.domain.net:22000 Plan: ---------------- Estimated Per-Host Requirements: Memory=256.00MB VCores=2 WARNING: The following tables are missing relevant table and/or column statistics. default.perf_test F02:PLAN FRAGMENT [UNPARTITIONED] 05:MERGING-EXCHANGE [UNPARTITIONED] order by: part_col ASC hosts=4 per-host-mem=unavailable tuple-ids=2 row-size=24B cardinality=unavailable F01:PLAN FRAGMENT [HASH(part_col)] DATASTREAM SINK [FRAGMENT=F02, EXCHANGE=05, UNPARTITIONED] 02:SORT | order by: part_col ASC | hosts=4 per-host-mem=0B | tuple-ids=2 row-size=24B cardinality=unavailable | 04:AGGREGATE [FINALIZE] | output: count:merge(*) | group by: part_col | hosts=4 per-host-mem=128.00MB | tuple-ids=1 row-size=24B cardinality=unavailable | 03:EXCHANGE [HASH(part_col)] hosts=4 per-host-mem=0B tuple-ids=1 row-size=24B cardinality=unavailable F00:PLAN FRAGMENT [RANDOM] DATASTREAM SINK [FRAGMENT=F01, EXCHANGE=03, HASH(part_col)] 01:AGGREGATE | output: count(*) | group by: part_col | hosts=4 per-host-mem=128.00MB | tuple-ids=1 row-size=24B cardinality=unavailable | 00:SCAN HDFS [default.perf_test, RANDOM] partitions=1/1 files=31 size=6.89GB table stats: unavailable column stats: unavailable hosts=4 per-host-mem=88.00MB tuple-ids=0 row-size=16B cardinality=unavailable ---------------- Estimated Per-Host Mem: 268435456 Estimated Per-Host VCores: 2 Tables Missing Stats: default.perf_test Admission result: Admitted immediately Request Pool: root.hadoop ExecSummary: Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail --------------------------------------------------------------------------------------------------------------------- 05:MERGING-EXCHANGE 1 119.314us 119.314us 5 -1 0 -1.00 B UNPARTITIONED 02:SORT 4 435.85us 588.641us 5 -1 8.02 MB 0 04:AGGREGATE 4 255.672ms 272.347ms 5 -1 3.15 MB 128.00 MB FINALIZE 03:EXCHANGE 4 62.142us 87.60us 20 -1 0 0 HASH(part_col) 01:AGGREGATE 4 2s803ms 3s270ms 20 -1 3.17 MB 128.00 MB 00:SCAN HDFS 4 53.678ms 56.787ms 103.50M -1 1.82 MB 88.00 MB default.perf_test Planner Timeline: 3.584ms - Analysis finished: 1.132ms (1.132ms) - Equivalence classes computed: 1.236ms (103.831us) - Single node plan created: 1.823ms (587.16us) - Distributed plan created: 2.300ms (477.253us) - Lineage info computed: 2.664ms (363.854us) - Planning finished: 3.584ms (919.713us) Query Timeline: 3s649ms - Start execution: 42.805us (42.805us) - Planning finished: 4.458ms (4.415ms) - Submit for admission: 4.775ms (317.243us) - Completed admission: 4.838ms (63.556us) - Ready to start remote fragments: 5.352ms (514.114us) - Remote fragments started: 511.206ms (505.853ms) - Rows available: 3s612ms (3s101ms) - First row fetched: 3s639ms (26.709ms) - Unregister query: 3s644ms (5.608ms) ImpalaServer: - ClientFetchWaitTimer: 30.950ms - RowMaterializationTimer: 72.117us Execution Profile a7442ed1a43b8e8d:8ad38fb17f29baa0:(Total: 3s608ms, non-child: 0ns, % non-child: 0.00%) Fragment start latencies: count: 8, last: 0.231475ns, min: 0.160659ns, max: 0.273263ns, mean: 0.221710ns, stddev: 0.041526ns Per Node Peak Memory Usage: worker4.some.domain.net:22000(8.41 MB) worker1.some.domain.net:22000(8.10 MB) worker3.some.domain.net:22000(8.41 MB) worker2.some.domain.net:22000(8.23 MB) - FinalizationTimer: 0ns Coordinator Fragment F02:(Total: 3s100ms, non-child: 251.349us, % non-child: 0.01%) MemoryUsage(500.0ms): 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB ThreadUsage(500.0ms): 1 - AverageThreadTokens: 1.00 - PeakMemoryUsage: 24.00 KB (24576) - PerHostPeakMemUsage: 0 - PrepareTime: 49.521us - RowsProduced: 5 (5) - TotalCpuTime: 3s634ms - TotalNetworkReceiveTime: 0ns - TotalNetworkSendTime: 0ns - TotalStorageWaitTime: 0ns BlockMgr: - BlockWritesOutstanding: 0 (0) - BlocksCreated: 96 (96) - BlocksRecycled: 1 (1) - BufferedPins: 0 (0) - BytesWritten: 0 - MaxBlockSize: 8.00 MB (8388608) - MemoryLimit: 38.40 GB (41231687680) - PeakMemoryUsage: 8.01 MB (8396800) - TotalBufferWaitTime: 0ns - TotalEncryptionTime: 0ns - TotalIntegrityCheckTime: 0ns - TotalReadBlockTime: 0ns EXCHANGE_NODE (id=5):(Total: 3s100ms, non-child: 119.314us, % non-child: 0.00%) BytesReceived(500.0ms): 0, 0, 0, 0, 0, 0, 0 - BytesReceived: 134.00 B (134) - ConvertRowBatchTime: 0ns - DeserializeRowBatchTimer: 31.363us - FirstBatchArrivalWaitTime: 3s100ms - MergeGetNext: 51.896us - MergeGetNextBatch: 45.734us - PeakMemoryUsage: 0 - RowsReturned: 5 (5) - RowsReturnedRate: 1.00 /sec - SendersBlockedTimer: 0ns - SendersBlockedTotalTimer(*): 0ns Averaged Fragment F01:(Total: 3s604ms, non-child: 0ns, % non-child: 0.00%) split sizes: min: 0, max: 0, avg: 0, stddev: 0 completion times: min:3s334ms max:3s367ms mean: 3s350ms stddev:15.818906.818.905508ms execution rates: min:0.00 /sec max:0.00 /sec mean:0.00 /sec stddev:0.00 /sec num instances: 4 - AverageThreadTokens: 1.00 - PeakMemoryUsage: 8.26 MB (8665088) - PerHostPeakMemUsage: 8.29 MB (8689408) - PrepareTime: 255.347ms - RowsProduced: 1 (1) - TotalCpuTime: 257.307ms - TotalNetworkReceiveTime: 3s347ms - TotalNetworkSendTime: 314.714us - TotalStorageWaitTime: 0ns BlockMgr: - BlockWritesOutstanding: 0 (0) - BlocksCreated: 96 (96) - BlocksRecycled: 1 (1) - BufferedPins: 0 (0) - BytesWritten: 0 - MaxBlockSize: 8.00 MB (8388608) - MemoryLimit: 38.40 GB (41231687680) - PeakMemoryUsage: 8.01 MB (8399530) - TotalBufferWaitTime: 0ns - TotalEncryptionTime: 0ns - TotalIntegrityCheckTime: 0ns - TotalReadBlockTime: 0ns CodeGen:(Total: 254.686ms, non-child: 254.686ms, % non-child: 100.00%) - CodegenTime: 478.430us - CompileTime: 323ns - LoadTime: 117.38us - ModuleFileSize: 2.23 MB (2333204) - OptimizationTime: 0ns - PrepareTime: 252.960ms DataStreamSender (dst_id=5):(Total: 40.576us, non-child: 40.576us, % non-child: 100.00%) - BytesSent: 33.00 B (33) - NetworkThroughput(*): 111.39 KB/sec - OverallThroughput: 713.32 KB/sec - PeakMemoryUsage: 4.00 KB (4096) - SerializeBatchTime: 14.713us - ThriftTransmitTime(*): 233.563us - UncompressedRowBatchSize: 48.00 B (48) SORT_NODE (id=2):(Total: 3s603ms, non-child: 435.86us, % non-child: 0.01%) - InMemorySortTime: 2.144us - InitialRunsCreated: 1 (1) - PeakMemoryUsage: 8.01 MB (8402944) - RowsReturned: 1 (1) - RowsReturnedRate: 0 - SortDataSize: 40.00 B (40) - TotalMergesPerformed: 0 (0) AGGREGATION_NODE (id=4):(Total: 3s603ms, non-child: 255.672ms, % non-child: 7.10%) - BuildTime: 21.426us - GetNewBlockTime: 55.516us - GetResultsTime: 2.938us - HashBuckets: 1.28K (1280) - LargestPartitionPercent: 12 (12) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 3.14 MB (3297280) - PinTime: 0ns - RowsRepartitioned: 0 (0) - RowsReturned: 1 (1) - RowsReturnedRate: 0 - SpilledPartitions: 0 (0) - UnpinTime: 870ns EXCHANGE_NODE (id=3):(Total: 3s347ms, non-child: 3s347ms, % non-child: 100.00%) - BytesReceived: 134.00 B (134) - ConvertRowBatchTime: 8.18us - DeserializeRowBatchTimer: 31.103us - FirstBatchArrivalWaitTime: 2s437ms - PeakMemoryUsage: 0 - RowsReturned: 5 (5) - RowsReturnedRate: 1.00 /sec - SendersBlockedTimer: 0ns - SendersBlockedTotalTimer(*): 0ns Averaged Fragment F00:(Total: 2s857ms, non-child: 0ns, % non-child: 0.00%) split sizes: min: 1.27 GB, max: 1.92 GB, avg: 1.72 GB, stddev: 272.90 MB completion times: min:1s886ms max:3s146ms mean: 2s675ms stddev:477.663985.663.984856ms execution rates: min:624.97 MB/sec max:690.60 MB/sec mean:663.26 MB/sec stddev:23.89 MB/sec num instances: 4 - AverageThreadTokens: 7.44 - PeakMemoryUsage: 4.93 MB (5166592) - PerHostPeakMemUsage: 8.07 MB (8457728) - PrepareTime: 185.387ms - RowsProduced: 5 (5) - TotalCpuTime: 22s474ms - TotalNetworkReceiveTime: 0ns - TotalNetworkSendTime: 1.299ms - TotalStorageWaitTime: 87.376ms CodeGen:(Total: 184.807ms, non-child: 184.807ms, % non-child: 100.00%) - CodegenTime: 290.866us - CompileTime: 233ns - LoadTime: 45.102us - ModuleFileSize: 2.23 MB (2333204) - OptimizationTime: 0ns - PrepareTime: 183.547ms DataStreamSender (dst_id=3):(Total: 24.337us, non-child: 24.337us, % non-child: 100.00%) - BytesSent: 134.00 B (134) - NetworkThroughput(*): 120.91 KB/sec - OverallThroughput: 5.27 MB/sec - PeakMemoryUsage: 40.00 KB (40960) - SerializeBatchTime: 23.984us - ThriftTransmitTime(*): 1.114ms - UncompressedRowBatchSize: 192.00 B (192) AGGREGATION_NODE (id=1):(Total: 2s856ms, non-child: 2s803ms, % non-child: 98.12%) - BuildTime: 2s601ms - GetNewBlockTime: 44.889us - GetResultsTime: 4.697us - HashBuckets: 4.10K (4096) - LargestPartitionPercent: 0 (0) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 3.17 MB (3325952) - PinTime: 0ns - RowsRepartitioned: 0 (0) - RowsReturned: 5 (5) - RowsReturnedRate: 1.00 /sec - SpilledPartitions: 0 (0) - UnpinTime: 792ns HDFS_SCAN_NODE (id=0):(Total: 53.678ms, non-child: 53.678ms, % non-child: 100.00%) - AverageHdfsReadThreadConcurrency: 0.00 - AverageScannerThreadConcurrency: 6.65 - BytesRead: 777.91 KB (796579) - BytesReadDataNodeCache: 0 - BytesReadLocal: 777.91 KB (796579) - BytesReadRemoteUnexpected: 0 - BytesReadShortCircuit: 777.91 KB (796579) - DecompressionTime: 136.633us - MaxCompressedTextFileLength: 0 - NumColumns: 1 (1) - NumDisksAccessed: 1 (1) - NumScannerThreadsStarted: 7 (7) - PeakMemoryUsage: 1.74 MB (1824256) - PerReadThreadRawHdfsThroughput: 748.79 MB/sec - RemoteScanRanges: 0 (0) - RowsRead: 25.87M (25874684) - RowsReturned: 25.87M (25874684) - RowsReturnedRate: 479.03 M/sec - ScanRangesComplete: 7 (7) - ScannerThreadsInvoluntaryContextSwitches: 10 (10) - ScannerThreadsTotalWallClockTime: 19s703ms - MaterializeTupleTime(*): 19s566ms - ScannerThreadsSysTime: 24.0ms - ScannerThreadsUserTime: 1s204ms - ScannerThreadsVoluntaryContextSwitches: 25.27K (25265) - TotalRawHdfsReadTime(*): 1.19ms - TotalReadThroughput: 274.84 KB/sec Fragment F01: Instance a7442ed1a43b8e8d:8ad38fb17f29baa5 (host=worker4.some.domain.net:22000):(Total: 3s605ms, non-child: 0ns, % non-child: 0.00%) MemoryUsage(500.0ms): 0, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.15 MB, 3.15 MB ThreadUsage(500.0ms): 1, 1, 1, 1, 1, 1, 1 - AverageThreadTokens: 1.00 - PeakMemoryUsage: 8.41 MB (8818688) - PerHostPeakMemUsage: 8.41 MB (8818688) - PrepareTime: 271.989ms - RowsProduced: 2 (2) - TotalCpuTime: 273.763ms - TotalNetworkReceiveTime: 3s332ms - TotalNetworkSendTime: 359.786us - TotalStorageWaitTime: 0ns BlockMgr: - BlockWritesOutstanding: 0 (0) - BlocksCreated: 96 (96) - BlocksRecycled: 1 (1) - BufferedPins: 0 (0) - BytesWritten: 0 - MaxBlockSize: 8.00 MB (8388608) - MemoryLimit: 38.40 GB (41231687680) - PeakMemoryUsage: 8.02 MB (8404992) - TotalBufferWaitTime: 0ns - TotalEncryptionTime: 0ns - TotalIntegrityCheckTime: 0ns - TotalReadBlockTime: 0ns CodeGen:(Total: 271.331ms, non-child: 271.331ms, % non-child: 100.00%) - CodegenTime: 437.945us - CompileTime: 267ns - LoadTime: 132.3us - ModuleFileSize: 2.23 MB (2333204) - OptimizationTime: 0ns - PrepareTime: 269.916ms DataStreamSender (dst_id=5):(Total: 41.665us, non-child: 41.665us, % non-child: 100.00%) - BytesSent: 53.00 B (53) - NetworkThroughput(*): 214.21 KB/sec - OverallThroughput: 1.21 MB/sec - PeakMemoryUsage: 4.00 KB (4096) - SerializeBatchTime: 16.673us - ThriftTransmitTime(*): 241.624us - UncompressedRowBatchSize: 76.00 B (76) SORT_NODE (id=2):(Total: 3s604ms, non-child: 313.224us, % non-child: 0.01%) - InMemorySortTime: 2.746us - InitialRunsCreated: 1 (1) - PeakMemoryUsage: 8.02 MB (8404992) - RowsReturned: 2 (2) - RowsReturnedRate: 0 - SortDataSize: 64.00 B (64) - TotalMergesPerformed: 0 (0) AGGREGATION_NODE (id=4):(Total: 3s604ms, non-child: 272.347ms, % non-child: 7.56%) - BuildTime: 28.572us - GetNewBlockTime: 86.550us - GetResultsTime: 5.612us - HashBuckets: 2.05K (2048) - LargestPartitionPercent: 12 (12) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 3.15 MB (3301376) - PinTime: 0ns - RowsRepartitioned: 0 (0) - RowsReturned: 2 (2) - RowsReturnedRate: 0 - SpilledPartitions: 0 (0) - UnpinTime: 827ns EXCHANGE_NODE (id=3):(Total: 3s332ms, non-child: 69.672us, % non-child: 0.00%) BytesReceived(500.0ms): 0, 0, 0, 0, 52.00 B, 52.00 B - BytesReceived: 208.00 B (208) - ConvertRowBatchTime: 8.324us - DeserializeRowBatchTimer: 44.682us - FirstBatchArrivalWaitTime: 2s117ms - PeakMemoryUsage: 0 - RowsReturned: 8 (8) - RowsReturnedRate: 2.00 /sec - SendersBlockedTimer: 0ns - SendersBlockedTotalTimer(*): 0ns Instance a7442ed1a43b8e8d:8ad38fb17f29baa3 (host=worker2.some.domain.net:22000):(Total: 3s604ms, non-child: 0ns, % non-child: 0.00%) MemoryUsage(500.0ms): 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.15 MB, 3.16 MB ThreadUsage(500.0ms): 1, 1, 1, 1, 1, 1, 1 - AverageThreadTokens: 1.00 - PeakMemoryUsage: 8.21 MB (8613888) - PerHostPeakMemUsage: 8.23 MB (8626176) - PrepareTime: 238.304ms - RowsProduced: 1 (1) - TotalCpuTime: 240.469ms - TotalNetworkReceiveTime: 3s364ms - TotalNetworkSendTime: 332.998us - TotalStorageWaitTime: 0ns CodeGen:(Total: 237.499ms, non-child: 237.499ms, % non-child: 100.00%) - CodegenTime: 621.766us - CompileTime: 524ns - LoadTime: 61.381us - ModuleFileSize: 2.23 MB (2333204) - OptimizationTime: 0ns - PrepareTime: 234.905ms DataStreamSender (dst_id=5):(Total: 62.686us, non-child: 62.686us, % non-child: 100.00%) - BytesSent: 31.00 B (31) - NetworkThroughput(*): 121.00 KB/sec - OverallThroughput: 482.94 KB/sec - PeakMemoryUsage: 4.00 KB (4096) - SerializeBatchTime: 25.91us - ThriftTransmitTime(*): 250.198us - UncompressedRowBatchSize: 40.00 B (40) SORT_NODE (id=2):(Total: 3s604ms, non-child: 588.641us, % non-child: 0.02%) - InMemorySortTime: 1.479us - InitialRunsCreated: 1 (1) - PeakMemoryUsage: 8.02 MB (8404992) - RowsReturned: 1 (1) - RowsReturnedRate: 0 - SortDataSize: 32.00 B (32) - TotalMergesPerformed: 0 (0) AGGREGATION_NODE (id=4):(Total: 3s603ms, non-child: 238.727ms, % non-child: 6.62%) - BuildTime: 30.821us - GetNewBlockTime: 57.183us - GetResultsTime: 3.80us - HashBuckets: 1.02K (1024) - LargestPartitionPercent: 25 (25) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 3.15 MB (3301376) - PinTime: 0ns - RowsRepartitioned: 0 (0) - RowsReturned: 1 (1) - RowsReturnedRate: 0 - SpilledPartitions: 0 (0) - UnpinTime: 1.96us EXCHANGE_NODE (id=3):(Total: 3s364ms, non-child: 87.60us, % non-child: 0.00%) BytesReceived(500.0ms): 0, 0, 0, 0, 0, 30.00 B, 90.00 B - BytesReceived: 120.00 B (120) - ConvertRowBatchTime: 11.349us - DeserializeRowBatchTimer: 44.852us - FirstBatchArrivalWaitTime: 2s150ms - PeakMemoryUsage: 0 - RowsReturned: 4 (4) - RowsReturnedRate: 1.00 /sec - SendersBlockedTimer: 0ns - SendersBlockedTotalTimer(*): 0ns Instance a7442ed1a43b8e8d:8ad38fb17f29baa4 (host=worker1.some.domain.net:22000):(Total: 3s604ms, non-child: 0ns, % non-child: 0.00%) MemoryUsage(500.0ms): 0, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB ThreadUsage(500.0ms): 1, 1, 1, 1, 1, 1, 1, 1 - AverageThreadTokens: 1.00 - PeakMemoryUsage: 8.02 MB (8409088) - PerHostPeakMemUsage: 8.10 MB (8494080) - PrepareTime: 240.672ms - RowsProduced: 0 (0) - TotalCpuTime: 242.522ms - TotalNetworkReceiveTime: 3s362ms - TotalNetworkSendTime: 230ns - TotalStorageWaitTime: 0ns BlockMgr: - BlockWritesOutstanding: 0 (0) - BlocksCreated: 96 (96) - BlocksRecycled: 1 (1) - BufferedPins: 0 (0) - BytesWritten: 0 - MaxBlockSize: 8.00 MB (8388608) - MemoryLimit: 38.40 GB (41231687680) - PeakMemoryUsage: 8.00 MB (8388608) - TotalBufferWaitTime: 0ns - TotalEncryptionTime: 0ns - TotalIntegrityCheckTime: 0ns - TotalReadBlockTime: 0ns CodeGen:(Total: 240.75ms, non-child: 240.75ms, % non-child: 100.00%) - CodegenTime: 430.880us - CompileTime: 253ns - LoadTime: 153.349us - ModuleFileSize: 2.23 MB (2333204) - OptimizationTime: 0ns - PrepareTime: 238.587ms DataStreamSender (dst_id=5):(Total: 14.671us, non-child: 14.671us, % non-child: 100.00%) - BytesSent: 0 - NetworkThroughput(*): 0.00 /sec - OverallThroughput: 0.00 /sec - PeakMemoryUsage: 4.00 KB (4096) - SerializeBatchTime: 0ns - ThriftTransmitTime(*): 0ns - UncompressedRowBatchSize: 0 SORT_NODE (id=2):(Total: 3s603ms, non-child: 364.504us, % non-child: 0.01%) - InMemorySortTime: 1.118us - InitialRunsCreated: 1 (1) - PeakMemoryUsage: 8.01 MB (8396800) - RowsReturned: 0 (0) - RowsReturnedRate: 0 - SortDataSize: 0 - TotalMergesPerformed: 0 (0) AGGREGATION_NODE (id=4):(Total: 3s603ms, non-child: 240.841ms, % non-child: 6.68%) - BuildTime: 1.836us - GetNewBlockTime: 39.494us - GetResultsTime: 0ns - HashBuckets: 0 (0) - LargestPartitionPercent: 0 (0) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 3.13 MB (3284992) - PinTime: 0ns - RowsRepartitioned: 0 (0) - RowsReturned: 0 (0) - RowsReturnedRate: 0 - SpilledPartitions: 0 (0) - UnpinTime: 686ns EXCHANGE_NODE (id=3):(Total: 3s362ms, non-child: 5.944us, % non-child: 0.00%) BytesReceived(500.0ms): 0, 0, 0, 0, 0, 0, 0 - BytesReceived: 0 - ConvertRowBatchTime: 817ns - DeserializeRowBatchTimer: 0ns - FirstBatchArrivalWaitTime: 3s362ms - PeakMemoryUsage: 0 - RowsReturned: 0 (0) - RowsReturnedRate: 0 - SendersBlockedTimer: 0ns - SendersBlockedTotalTimer(*): 0ns Instance a7442ed1a43b8e8d:8ad38fb17f29baa2 (host=worker3.some.domain.net:22000):(Total: 3s603ms, non-child: 0ns, % non-child: 0.00%) MemoryUsage(500.0ms): 0, 3.14 MB, 3.14 MB, 3.14 MB, 3.14 MB, 3.15 MB, 3.15 MB, 3.16 MB ThreadUsage(500.0ms): 1, 1, 1, 1, 1, 1, 1, 1 - AverageThreadTokens: 1.00 - PeakMemoryUsage: 8.41 MB (8818688) - PerHostPeakMemUsage: 8.41 MB (8818688) - PrepareTime: 270.423ms - RowsProduced: 2 (2) - TotalCpuTime: 272.476ms - TotalNetworkReceiveTime: 3s331ms - TotalNetworkSendTime: 565.845us - TotalStorageWaitTime: 0ns BlockMgr: - BlockWritesOutstanding: 0 (0) - BlocksCreated: 96 (96) - BlocksRecycled: 1 (1) - BufferedPins: 0 (0) - BytesWritten: 0 - MaxBlockSize: 8.00 MB (8388608) - MemoryLimit: 38.40 GB (41231687680) - PeakMemoryUsage: 8.02 MB (8404992) - TotalBufferWaitTime: 0ns - TotalEncryptionTime: 0ns - TotalIntegrityCheckTime: 0ns - TotalReadBlockTime: 0ns CodeGen:(Total: 269.840ms, non-child: 269.840ms, % non-child: 100.00%) - CodegenTime: 423.130us - CompileTime: 250ns - LoadTime: 121.419us - ModuleFileSize: 2.23 MB (2333204) - OptimizationTime: 0ns - PrepareTime: 268.433ms DataStreamSender (dst_id=5):(Total: 43.283us, non-child: 43.283us, % non-child: 100.00%) - BytesSent: 50.00 B (50) - NetworkThroughput(*): 110.36 KB/sec - OverallThroughput: 1.10 MB/sec - PeakMemoryUsage: 4.00 KB (4096) - SerializeBatchTime: 17.90us - ThriftTransmitTime(*): 442.430us - UncompressedRowBatchSize: 76.00 B (76) SORT_NODE (id=2):(Total: 3s602ms, non-child: 473.974us, % non-child: 0.01%) - InMemorySortTime: 3.235us - InitialRunsCreated: 1 (1) - PeakMemoryUsage: 8.02 MB (8404992) - RowsReturned: 2 (2) - RowsReturnedRate: 0 - SortDataSize: 64.00 B (64) - TotalMergesPerformed: 0 (0) AGGREGATION_NODE (id=4):(Total: 3s602ms, non-child: 270.772ms, % non-child: 7.52%) - BuildTime: 24.478us - GetNewBlockTime: 38.840us - GetResultsTime: 3.61us - HashBuckets: 2.05K (2048) - LargestPartitionPercent: 12 (12) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 3.15 MB (3301376) - PinTime: 0ns - RowsRepartitioned: 0 (0) - RowsReturned: 2 (2) - RowsReturnedRate: 0 - SpilledPartitions: 0 (0) - UnpinTime: 874ns EXCHANGE_NODE (id=3):(Total: 3s331ms, non-child: 85.894us, % non-child: 0.00%) BytesReceived(500.0ms): 0, 0, 0, 0, 52.00 B, 52.00 B, 156.00 B - BytesReceived: 208.00 B (208) - ConvertRowBatchTime: 11.584us - DeserializeRowBatchTimer: 34.881us - FirstBatchArrivalWaitTime: 2s117ms - PeakMemoryUsage: 0 - RowsReturned: 8 (8) - RowsReturnedRate: 2.00 /sec - SendersBlockedTimer: 0ns - SendersBlockedTotalTimer(*): 0ns Fragment F00: Instance a7442ed1a43b8e8d:8ad38fb17f29baa8 (host=worker1.some.domain.net:22000):(Total: 3s327ms, non-child: 0ns, % non-child: 0.00%) Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:8/1.92 GB MemoryUsage(500.0ms): 4.73 MB, 4.76 MB, 4.77 MB, 4.76 MB, 4.76 MB, 4.73 MB, 4.23 MB
The new cluster:
+---------------------+----------+ | part_col | count(*) | +---------------------+----------+ | 2017-01-01 00:00:00 | 18428899 | | 2017-01-02 00:00:00 | 20974112 | | 2017-01-03 00:00:00 | 13178213 | | 2017-01-04 00:00:00 | 26768773 | | 2017-01-05 00:00:00 | 24148741 | +---------------------+----------+ Query Runtime Profile: Query (id=814dc21ba8355459:ac7e36724f454893): Summary: Session ID: 694ced478c51b5e5:321e569a7a0513b5 Session Type: BEESWAX Start Time: 2017-01-31 09:08:07.217250000 End Time: 2017-01-31 09:08:12.161790000 Query Type: QUERY Query State: FINISHED Query Status: OK Impala Version: impalad version 2.6.0-cdh5.8.3 RELEASE (build c644f476b774db9db87a619628f7a6ecc5f843e0) User: test_user@SOME.DOMAIN.NET Connected User: test_user@SOME.DOMAIN.NET Delegated User: Network Address: ::ffff:xx.xx.xx.xx:55766 Default Db: default Sql Statement: select part_col, count(*) from perf_test group by part_col order by 1 Coordinator: worker3.some.domain.net:22000 Query Options (non default): Plan: ---------------- Estimated Per-Host Requirements: Memory=344.00MB VCores=2 WARNING: The following tables are missing relevant table and/or column statistics. default.perf_test 05:MERGING-EXCHANGE [UNPARTITIONED] | order by: part_col ASC | hosts=5 per-host-mem=unavailable | tuple-ids=2 row-size=24B cardinality=unavailable | 02:SORT | order by: part_col ASC | hosts=5 per-host-mem=0B | tuple-ids=2 row-size=24B cardinality=unavailable | 04:AGGREGATE [FINALIZE] | output: count:merge(*) | group by: part_col | hosts=5 per-host-mem=128.00MB | tuple-ids=1 row-size=24B cardinality=unavailable | 03:EXCHANGE [HASH(part_col)] | hosts=5 per-host-mem=0B | tuple-ids=1 row-size=24B cardinality=unavailable | 01:AGGREGATE [STREAMING] | output: count(*) | group by: part_col | hosts=5 per-host-mem=128.00MB | tuple-ids=1 row-size=24B cardinality=unavailable | 00:SCAN HDFS [default.perf_test, RANDOM] partitions=1/1 files=32 size=7.22GB table stats: unavailable column stats: unavailable hosts=5 per-host-mem=88.00MB tuple-ids=0 row-size=16B cardinality=unavailable ---------------- Estimated Per-Host Mem: 360710144 Estimated Per-Host VCores: 2 Tables Missing Stats: default.perf_test Request Pool: root.test_user Admission result: Admitted immediately ExecSummary: Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail --------------------------------------------------------------------------------------------------------------------- 05:MERGING-EXCHANGE 1 82.553us 82.553us 5 -1 0 -1.00 B UNPARTITIONED 02:SORT 5 434.217us 491.025us 5 -1 8.02 MB 0 04:AGGREGATE 5 279.461ms 316.351ms 5 -1 2.30 MB 128.00 MB FINALIZE 03:EXCHANGE 5 28.378us 55.658us 25 -1 0 0 HASH(part_col) 01:AGGREGATE 5 3s965ms 4s739ms 25 -1 1.59 MB 128.00 MB STREAMING 00:SCAN HDFS 5 83.393ms 118.345ms 103.50M -1 12.04 MB 88.00 MB default.perf_test Planner Timeline: 23.087ms - Analysis finished: 10.147ms (10.147ms) - Equivalence classes computed: 10.654ms (506.930us) - Single node plan created: 14.745ms (4.091ms) - Runtime filters computed: 14.787ms (41.721us) - Distributed plan created: 17.268ms (2.481ms) - Lineage info computed: 18.728ms (1.460ms) - Planning finished: 23.087ms (4.358ms) Query Timeline: 4s949ms - Start execution: 56.064us (56.064us) - Planning finished: 26.811ms (26.755ms) - Submit for admission: 27.851ms (1.039ms) - Completed admission: 27.929ms (77.878us) - Ready to start 10 remote fragments: 28.364ms (434.946us) - All 10 remote fragments started: 29.638ms (1.273ms) - Rows available: 4s860ms (4s831ms) - First row fetched: 4s942ms (81.424ms) - Unregister query: 4s944ms (2.126ms) ImpalaServer: - ClientFetchWaitTimer: 82.476ms - RowMaterializationTimer: 51.064us Execution Profile 814dc21ba8355459:ac7e36724f454893:(Total: 4s833ms, non-child: 0.000ns, % non-child: 0.00%) Number of filters: 0 Filter routing table: ID Src. Node Tgt. Node(s) Targets Target type Partition filter Pending (Expected) First arrived Completed ------------------------------------------------------------------------------------------------------------------- Fragment start latencies: Count: 10, 25th %-ile: 1ms, 50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th %-ile: 1ms Per Node Peak Memory Usage: worker2.some.domain.net:22000(23.93 MB) worker5.some.domain.net:22000(23.32 MB) worker1.some.domain.net:22000(23.85 MB) worker4.some.domain.net:22000(23.94 MB) worker3.some.domain.net:22000(23.86 MB) - FiltersReceived: 0 (0) - FinalizationTimer: 0.000ns Coordinator Fragment F02:(Total: 4s831ms, non-child: 255.854us, % non-child: 0.01%) MemoryUsage(500.000ms): 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB - AverageThreadTokens: 0.00 - BloomFilterBytes: 0 - PeakMemoryUsage: 20.04 KB (20520) - PerHostPeakMemUsage: 0 - PrepareTime: 45.494us - RowsProduced: 5 (5) - TotalCpuTime: 4s914ms - TotalNetworkReceiveTime: 0.000ns - TotalNetworkSendTime: 0.000ns - TotalStorageWaitTime: 0.000ns BlockMgr: - BlockWritesOutstanding: 0 (0) - BlocksCreated: 49 (49) - BlocksRecycled: 0 (0) - BufferedPins: 0 (0) - BytesWritten: 0 - MaxBlockSize: 8.00 MB (8388608) - MemoryLimit: 61.98 GB (66552700928) - PeakMemoryUsage: 8.77 MB (9191424) - TotalBufferWaitTime: 0.000ns - TotalEncryptionTime: 0.000ns - TotalIntegrityCheckTime: 0.000ns - TotalReadBlockTime: 0.000ns EXCHANGE_NODE (id=5):(Total: 4s831ms, non-child: 82.553us, % non-child: 0.00%) BytesReceived(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0 - BytesReceived: 134.00 B (134) - ConvertRowBatchTime: 0.000ns - DeserializeRowBatchTimer: 31.107us - FirstBatchArrivalWaitTime: 4s830ms - MergeGetNext: 25.594us - MergeGetNextBatch: 15.618us - PeakMemoryUsage: 0 - RowsReturned: 5 (5) - RowsReturnedRate: 1.00 /sec - SendersBlockedTimer: 0.000ns - SendersBlockedTotalTimer(*): 0.000ns Averaged Fragment F01:(Total: 4s721ms, non-child: 0.000ns, % non-child: 0.00%) split sizes: min: 0, max: 0, avg: 0, stddev: 0 completion times: min:4s831ms max:4s832ms mean: 4s832ms stddev:302.637us execution rates: min:0.00 /sec max:0.00 /sec mean:0.00 /sec stddev:0.00 /sec num instances: 5 - AverageThreadTokens: 1.00 - BloomFilterBytes: 0 - PeakMemoryUsage: 10.30 MB (10802320) - PerHostPeakMemUsage: 23.78 MB (24934251) - PrepareTime: 278.764ms - RowsProduced: 1 (1) - TotalCpuTime: 390.687ms - TotalNetworkReceiveTime: 4s440ms - TotalNetworkSendTime: 107.306us - TotalStorageWaitTime: 0.000ns BlockMgr: - BlockWritesOutstanding: 0 (0) - BlocksCreated: 49 (49) - BlocksRecycled: 0 (0) - BufferedPins: 0 (0) - BytesWritten: 0 - MaxBlockSize: 8.00 MB (8388608) - MemoryLimit: 61.98 GB (66552700928) - PeakMemoryUsage: 8.77 MB (9191424) - TotalBufferWaitTime: 0.000ns - TotalEncryptionTime: 0.000ns - TotalIntegrityCheckTime: 0.000ns - TotalReadBlockTime: 0.000ns CodeGen:(Total: 381.581ms, non-child: 381.581ms, % non-child: 100.00%) - CodegenTime: 6.404ms - CompileTime: 7.476ms - LoadTime: 0.000ns - ModuleBitcodeSize: 2.15 MB (2254856) - NumFunctions: 29 (29) - NumInstructions: 550 (550) - OptimizationTime: 102.403ms - PrepareTime: 270.719ms DataStreamSender (dst_id=5):(Total: 68.002us, non-child: 68.002us, % non-child: 100.00%) - BytesSent: 26.00 B (26) - NetworkThroughput(*): 787.80 KB/sec - OverallThroughput: 295.48 KB/sec - PeakMemoryUsage: 4.00 KB (4096) - RowsReturned: 1 (1) - SerializeBatchTime: 19.405us - TransmitDataRPCTime: 21.326us - UncompressedRowBatchSize: 38.00 B (38) SORT_NODE (id=2):(Total: 4s720ms, non-child: 434.218us, % non-child: 0.01%) - InMemorySortTime: 1.484us - InitialRunsCreated: 1 (1) - PeakMemoryUsage: 8.01 MB (8401715) - RowsReturned: 1 (1) - RowsReturnedRate: 0 - SortDataSize: 32.00 B (32) - SpilledRuns: 0 (0) - TotalMergesPerformed: 0 (0) AGGREGATION_NODE (id=4):(Total: 4s720ms, non-child: 279.461ms, % non-child: 5.92%) - BuildTime: 13.858us - GetNewBlockTime: 145.320us - GetResultsTime: 1.887us - HTResizeTime: 3.177us - HashBuckets: 1.02K (1024) - LargestPartitionPercent: 8 (8) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 2.29 MB (2401408) - PinTime: 0.000ns - RowsRepartitioned: 0 (0) - RowsReturned: 1 (1) - RowsReturnedRate: 0 - SpilledPartitions: 0 (0) - UnpinTime: 1.046us EXCHANGE_NODE (id=3):(Total: 4s440ms, non-child: 4s440ms, % non-child: 100.00%) - BytesReceived: 133.00 B (133) - ConvertRowBatchTime: 4.099us - DeserializeRowBatchTimer: 27.477us - FirstBatchArrivalWaitTime: 2s625ms - PeakMemoryUsage: 0 - RowsReturned: 5 (5) - RowsReturnedRate: 1.00 /sec - SendersBlockedTimer: 0.000ns - SendersBlockedTotalTimer(*): 0.000ns Averaged Fragment F00:(Total: 4s051ms, non-child: 0.000ns, % non-child: 0.00%) split sizes: min: 1.33 GB, max: 1.54 GB, avg: 1.44 GB, stddev: 78.16 MB completion times: min:1s806ms max:4s831ms mean: 4s052ms stddev:1s134ms execution rates: min:317.74 MB/sec max:756.58 MB/sec mean:412.43 MB/sec stddev:172.21 MB/sec num instances: 5 - AverageThreadTokens: 6.25 - BloomFilterBytes: 0 - PeakMemoryUsage: 13.49 MB (14146688) - PerHostPeakMemUsage: 23.78 MB (24934251) - PrepareTime: 281.243ms - RowsProduced: 5 (5) - TotalCpuTime: 26s501ms - TotalNetworkReceiveTime: 0.000ns - TotalNetworkSendTime: 506.296us - TotalStorageWaitTime: 136.659ms CodeGen:(Total: 274.654ms, non-child: 274.654ms, % non-child: 100.00%) - CodegenTime: 6.016ms - CompileTime: 0.000ns - LoadTime: 0.000ns - ModuleBitcodeSize: 2.15 MB (2254856) - NumFunctions: 0 (0) - NumInstructions: 0 (0) - OptimizationTime: 0.000ns - PrepareTime: 273.688ms DataStreamSender (dst_id=3):(Total: 147.360us, non-child: 147.360us, % non-child: 100.00%) - BytesSent: 133.00 B (133) - NetworkThroughput(*): 1.84 MB/sec - OverallThroughput: 885.74 KB/sec - PeakMemoryUsage: 32.00 KB (32768) - RowsReturned: 5 (5) - SerializeBatchTime: 48.621us - TransmitDataRPCTime: 73.238us - UncompressedRowBatchSize: 192.00 B (192) AGGREGATION_NODE (id=1):(Total: 4s049ms, non-child: 3s965ms, % non-child: 97.94%) - GetNewBlockTime: 78.103us - GetResultsTime: 9.538us - HTResizeTime: 523.269us - HashBuckets: 8.19K (8192) - LargestPartitionPercent: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 1.59 MB (1668224) - PinTime: 0.000ns - ReductionFactorEstimate: 0.00 - ReductionFactorThresholdToExpand: 0.00 - RowsPassedThrough: 0 (0) - RowsReturned: 5 (5) - RowsReturnedRate: 1.00 /sec - StreamingTime: 3s666ms - UnpinTime: 0.000ns HDFS_SCAN_NODE (id=0):(Total: 83.393ms, non-child: 83.393ms, % non-child: 100.00%) - AverageHdfsReadThreadConcurrency: 0.00 - AverageScannerThreadConcurrency: 5.63 - BytesRead: 650.78 KB (666403) - BytesReadDataNodeCache: 0 - BytesReadLocal: 650.78 KB (666403) - BytesReadRemoteUnexpected: 0 - BytesReadShortCircuit: 650.78 KB (666403) - DecompressionTime: 345.570us - MaxCompressedTextFileLength: 0 - NumColumns: 1 (1) - NumDisksAccessed: 6 (6) - NumRowGroups: 6 (6) - NumScannerThreadsStarted: 6 (6) - PeakMemoryUsage: 11.89 MB (12465356) - PerReadThreadRawHdfsThroughput: 686.45 MB/sec - RemoteScanRanges: 0 (0) - RowsRead: 20.70M (20699747) - RowsReturned: 20.70M (20699747) - RowsReturnedRate: 282.05 M/sec - ScanRangesComplete: 6 (6) - ScannerThreadsInvoluntaryContextSwitches: 12 (12) - ScannerThreadsTotalWallClockTime: 22s586ms - MaterializeTupleTime(*): 22s440ms - ScannerThreadsSysTime: 32.800ms - ScannerThreadsUserTime: 666.440ms - ScannerThreadsVoluntaryContextSwitches: 20.07K (20074) - TotalRawHdfsReadTime(*): 1.368ms - TotalReadThroughput: 181.83 KB/sec Fragment F01: Instance 814dc21ba8355459:ac7e36724f454896 (host=worker1.some.domain.net:22000):(Total: 4s773ms, non-child: 0.000ns, % non-child: 0.00%) MemoryUsage(500.000ms): 10.28 MB, 10.28 MB, 10.28 MB, 10.28 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.31 MB ThreadUsage(500.000ms): 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 - AverageThreadTokens: 1.00 - BloomFilterBytes: 0 - PeakMemoryUsage: 10.31 MB (10810528) - PerHostPeakMemUsage: 23.85 MB (25013504) - PrepareTime: 140.565ms - RowsProduced: 2 (2) - TotalCpuTime: 198.707ms - TotalNetworkReceiveTime: 4s632ms - TotalNetworkSendTime: 144.042us - TotalStorageWaitTime: 0.000ns BlockMgr: - BlockWritesOutstanding: 0 (0) - BlocksCreated: 49 (49) - BlocksRecycled: 0 (0) - BufferedPins: 0 (0) - BytesWritten: 0 - MaxBlockSize: 8.00 MB (8388608) - MemoryLimit: 61.98 GB (66552700928) - PeakMemoryUsage: 8.77 MB (9191424) - TotalBufferWaitTime: 0.000ns - TotalEncryptionTime: 0.000ns - TotalIntegrityCheckTime: 0.000ns - TotalReadBlockTime: 0.000ns CodeGen:(Total: 193.769ms, non-child: 193.769ms, % non-child: 100.00%) - CodegenTime: 3.318ms - CompileTime: 3.802ms - LoadTime: 0.000ns - ModuleBitcodeSize: 2.15 MB (2254856) - NumFunctions: 29 (29) - NumInstructions: 550 (550) - OptimizationTime: 53.125ms - PrepareTime: 136.382ms DataStreamSender (dst_id=5):(Total: 119.588us, non-child: 119.588us, % non-child: 100.00%) - BytesSent: 50.00 B (50) - NetworkThroughput(*): 1.84 MB/sec - OverallThroughput: 408.30 KB/sec - PeakMemoryUsage: 4.00 KB (4096) - RowsReturned: 2 (2) - SerializeBatchTime: 73.063us - TransmitDataRPCTime: 25.979us - UncompressedRowBatchSize: 76.00 B (76) SORT_NODE (id=2):(Total: 4s773ms, non-child: 271.839us, % non-child: 0.01%) ExecOption: Codegen Enabled - InMemorySortTime: 1.324us - InitialRunsCreated: 1 (1) - PeakMemoryUsage: 8.02 MB (8404992) - RowsReturned: 2 (2) - RowsReturnedRate: 0 - SortDataSize: 64.00 B (64) - SpilledRuns: 0 (0) - TotalMergesPerformed: 0 (0) AGGREGATION_NODE (id=4):(Total: 4s773ms, non-child: 140.842ms, % non-child: 2.95%) ExecOption: Codegen Disabled: HashTableCtx::CodegenEvalRow(): type TIMESTAMP NYI - BuildTime: 12.344us - GetNewBlockTime: 87.985us - GetResultsTime: 1.797us - HTResizeTime: 1.646us - HashBuckets: 2.05K (2048) - LargestPartitionPercent: 10 (10) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 2.30 MB (2409600) - PinTime: 0.000ns - RowsRepartitioned: 0 (0) - RowsReturned: 2 (2) - RowsReturnedRate: 0 - SpilledPartitions: 0 (0) - UnpinTime: 616.000ns EXCHANGE_NODE (id=3):(Total: 4s632ms, non-child: 28.353us, % non-child: 0.00%) BytesReceived(500.000ms): 0, 0, 0, 0, 52.00 B, 52.00 B, 52.00 B, 52.00 B, 52.00 B, 208.00 B - BytesReceived: 260.00 B (260) - ConvertRowBatchTime: 4.176us - DeserializeRowBatchTimer: 25.676us - FirstBatchArrivalWaitTime: 1s607ms - PeakMemoryUsage: 0 - RowsReturned: 10 (10) - RowsReturnedRate: 2.00 /sec - SendersBlockedTimer: 0.000ns - SendersBlockedTotalTimer(*): 0.000ns Instance 814dc21ba8355459:ac7e36724f454897 (host=worker5.some.domain.net:22000):(Total: 4s709ms, non-child: 0.000ns, % non-child: 0.00%) MemoryUsage(500.000ms): 0, 10.28 MB, 10.28 MB, 10.28 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.29 MB, 10.30 MB ThreadUsage(500.000ms): 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 - AverageThreadTokens: 1.00 - BloomFilterBytes: 0 - PeakMemoryUsage: 10.31 MB (10810528) - PerHostPeakMemUsage: 23.32 MB (24449296) - PrepareTime: 311.018ms - RowsProduced: 2 (2) - TotalCpuTime: 435.142ms - TotalNetworkReceiveTime: 4s396ms - TotalNetworkSendTime: 187.827us - TotalStorageWaitTime: 0.000ns BlockMgr: - BlockWritesOutstanding: 0 (0) - BlocksCreated: 49 (49) - BlocksRecycled: 0 (0) - BufferedPins: 0 (0) - BytesWritten: 0 - MaxBlockSize: 8.00 MB (8388608) - MemoryLimit: 61.98 GB (66552700928) - PeakMemoryUsage: 8.77 MB (9191424) - TotalBufferWaitTime: 0.000ns - TotalEncryptionTime: 0.000ns - TotalIntegrityCheckTime: 0.000ns - TotalReadBlockTime: 0.000ns CodeGen:(Total: 425.412ms, non-child: 425.412ms, % non-child: 100.00%) - CodegenTime: 6.872ms - CompileTime: 8.123ms - LoadTime: 0.000ns - ModuleBitcodeSize: 2.15 MB (2254856) - NumFunctions: 29 (29) - NumInstructions: 550 (550) - OptimizationTime: 113.870ms - PrepareTime: 302.324ms DataStreamSender (dst_id=5):(Total: 69.893us, non-child: 69.893us, % non-child: 100.00%) - BytesSent: 53.00 B (53) - NetworkThroughput(*): 1.31 MB/sec - OverallThroughput: 740.53 KB/sec - PeakMemoryUsage: 4.00 KB (4096) - RowsReturned: 2 (2) - SerializeBatchTime: 11.380us - TransmitDataRPCTime: 38.673us - UncompressedRowBatchSize: 76.00 B (76) SORT_NODE (id=2):(Total: 4s708ms, non-child: 491.025us, % non-child: 0.01%) ExecOption: Codegen Enabled - InMemorySortTime: 2.603us - InitialRunsCreated: 1 (1) - PeakMemoryUsage: 8.02 MB (8404992) - RowsReturned: 2 (2) - RowsReturnedRate: 0 - SortDataSize: 64.00 B (64) - SpilledRuns: 0 (0) - TotalMergesPerformed: 0 (0) AGGREGATION_NODE (id=4):(Total: 4s708ms, non-child: 311.656ms, % non-child: 6.62%) ExecOption: Codegen Disabled: HashTableCtx::CodegenEvalRow(): type TIMESTAMP NYI - BuildTime: 21.607us - GetNewBlockTime: 144.739us - GetResultsTime: 5.404us - HTResizeTime: 3.560us - HashBuckets: 2.05K (2048) - LargestPartitionPercent: 10 (10) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 2.30 MB (2409600) - PinTime: 0.000ns - RowsRepartitioned: 0 (0) - RowsReturned: 2 (2) - RowsReturnedRate: 0 - SpilledPartitions: 0 (0) - UnpinTime: 1.295us EXCHANGE_NODE (id=3):(Total: 4s396ms, non-child: 42.905us, % non-child: 0.00%) BytesReceived(500.000ms): 0, 0, 0, 51.00 B, 51.00 B, 51.00 B, 51.00 B, 51.00 B, 154.00 B - BytesReceived: 258.00 B (258) - ConvertRowBatchTime: 6.460us - DeserializeRowBatchTimer: 67.686us - FirstBatchArrivalWaitTime: 1s371ms - PeakMemoryUsage: 0 - RowsReturned: 10 (10) - RowsReturnedRate: 2.00 /sec - SendersBlockedTimer: 0.000ns - SendersBlockedTotalTimer(*): 0.000ns
Created 02-03-2017 03:04 PM
It's not entirely obvious to me what changed. Since it's grouping by timestamp, you're running into https://issues.cloudera.org/browse/IMPALA-3884, which is forcing it to execute the aggregation in an interpreted mode. I suspect that the performance of that interpreted code path may have changed a bit - we tend to focus most of our tuning on the codegened code path.
On releases with IMPALA-3884 fixed it should be a *lot* faster.
" ExecOption: Codegen Disabled: HashTableCtx::CodegenEvalRow(): type TIMESTAMP NYI"
Created 02-08-2017 05:26 AM
It looks like the problem is really in the timestamp field. Running a similar query on table without timestamp show much better results on the new environment. Thanks for the help