Support Questions
Find answers, ask questions, and share your expertise

Impala slow aggregation

New Contributor

Hello All,

 

I am facing an issue with aggregation, I tested both in Impala 2.5 and Impala 2.8 and I got the same results. the query is simple aggregation with few CASE WHEN. I am also cross joining to a reference table which I broadcast to the big table (this reference table is 20 rows) 

 

Snippet from the plan

 

01:SCAN HDFS [impala_poc_test.event_final_hour ed, RANDOM]
   partitions=6/1824 files=6 size=234.07MB   predicates: ((event_date_time >= 20161122170000 AND event_date_time < 20161122180000) OR (event_date_time >= 20161129170000 AND event_date_time < 20161129180000) OR (event_date_time >= 20161206170000 AND event_date_time < 20161206180000) OR (event_date_time >= 20161213170000 AND event_date_time < 20161213180000) OR (event_date_time >= 20161201700000 AND event_date_time < 20161220180000) OR (event_date_time >= 20161227170000 AND event_date_time < 20161227180000))
   table stats: 1299660217 rows total
   column stats: all
   hosts=5 per-host-mem=576.00MB
   tuple-ids=3 row-size=584B cardinality=290812
----------------
    Estimated Per-Host Mem: 1492518982
    Estimated Per-Host VCores: 2
    Request Pool: default-pool
    Admission result: Admitted immediately
    ExecSummary: 
Operator              #Hosts   Avg Time   Max Time   #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail                         
---------------------------------------------------------------------------------------------------------------------------------
10:EXCHANGE                1    0.000ns    0.000ns       0     322.39K          0        -1.00 B  UNPARTITIONED                  
09:AGGREGATE               5    4.999ms   19.000ms   4.84K     322.39K    2.69 MB       27.06 MB  FINALIZE                       
08:EXCHANGE                5  400.000us    2.000ms  40.77K       3.22M          0              0  HASH(name,value)               
04:AGGREGATE               5    1s745ms    1s917ms  84.55K       3.22M   10.05 MB      270.56 MB  STREAMING                      
07:AGGREGATE               5    5s213ms    6s022ms  30.03M       5.82M  778.19 MB      423.69 MB  FINALIZE                       
06:EXCHANGE                5  235.600ms  297.000ms  31.91M       5.82M          0              0  HASH(account_id,name,CASE W... 
03:AGGREGATE               5    7s756ms   11s435ms  31.91M       5.82M    1.01 GB      423.69 MB  STREAMING                      
02:NESTED LOOP JOIN        5  522.800ms  789.000ms  58.16M       5.82M    1.77 MB       320.00 B  CROSS JOIN, BROADCAST          
|--05:EXCHANGE             5    0.000ns    0.000ns      20          20          0              0  BROADCAST                      
|  00:UNION                1    0.000ns    0.000ns      20          20          0        -1.00 B                                 
01:SCAN HDFS               5   53.200ms   68.000ms   2.91M     290.81K   60.32 MB      576.00 MB  impala_poc_test.event_...     Planner Timeline: 26.186ms
       - Analysis finished: 2.850ms (2.850ms)
       - Equivalence classes computed: 5.712ms (2.861ms)
       - Single node plan created: 10.809ms (5.096ms)
       - Runtime filters computed: 10.904ms (95.092us)
       - Distributed plan created: 11.898ms (994.128us)
       - Planning finished: 26.186ms (14.288ms)
    Query Timeline: 28s413ms       - Query submitted: 0.000ns (0.000ns)
       - Planning finished: 37.000ms (37.000ms)
       - Submit for admission: 39.000ms (2.000ms)
       - Completed admission: 39.000ms (0.000ns)
       - Ready to start 17 fragment instances: 39.000ms (0.000ns)
       - All 17 fragment instances started: 52.000ms (13.000ms)
       - Rows available: 28s381ms (28s329ms)
       - First row fetched: 28s384ms (3.000ms)
       - Unregister query: 28s395ms (11.000ms)
     - ComputeScanRangeAssignmentTimer: 0.000ns
  ImpalaServer:
     - ClientFetchWaitTimer: 12.000ms
     - RowMaterializationTimer: 2.000ms
  Execution Profile 644f639b8d2a4480:f8eac30600000000:(Total: 28s338ms, non-child: 0.000ns, % non-child: 0.00%)
    Number of filters: 0
    Filter routing table: 
 ID  Src. Node  Tgt. Node(s)  Targets  Target type  Partition filter  Pending (Expected)  First arrived  Completed   Enabled
----------------------------------------------------------------------------------------------------------------------------

    Fragment instance start latencies: Count: 17, 25th %-ile: 3ms, 50th %-ile: 3ms, 75th %-ile: 4ms, 90th %-ile: 5ms, 95th %-ile: 6ms, 99.9th %-ile: 8ms
    Per Node Peak Memory Usage: etl002-slv-02:22000(871.29 MB) etl002-slv-03:22000(1.01 GB) etl002-slv-05:22000(840.38 MB) etl002-slv-04:22000(1.63 GB) etl002-slv-01:22000(1.06 GB) 
     - FiltersReceived: 0 (0)
     - FinalizationTimer: 0.000ns
    Coordinator Fragment F04:
      Instance 644f639b8d2a4480:f8eac30600000000 (host=etl002-slv-01:22000):(Total: 42.000ms, non-child: 0.000ns, % non-child: 0.00%)
        MemoryUsage(500.000ms): 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB, 8.00 KB
         - AverageThreadTokens: 0.00 
         - BloomFilterBytes: 0
         - PeakMemoryUsage: 8.00 KB (8192)
         - PerHostPeakMemUsage: 1.06 GB (1134475162)
         - RowsProduced: 0 (0)
         - TotalNetworkReceiveTime: 0.000ns
         - TotalNetworkSendTime: 0.000ns
         - TotalStorageWaitTime: 0.000ns
         - TotalThreadsInvoluntaryContextSwitches: 0 (0)
         - TotalThreadsTotalWallClockTime: 0.000ns
           - TotalThreadsSysTime: 0.000ns
           - TotalThreadsUserTime: 0.000ns
         - TotalThreadsVoluntaryContextSwitches: 0 (0)
        Fragment Instance Lifecycle Timings:
           - OpenTime: 0.000ns
             - ExecTreeOpenTime: 0.000ns
           - PrepareTime: 42.000ms
             - ExecTreePrepareTime: 0.000ns
        BlockMgr:
           - BlockWritesOutstanding: 0 (0)
           - BlocksCreated: 225 (225)
           - BlocksRecycled: 47 (47)
           - BufferedPins: 0 (0)
           - BytesWritten: 0
           - MaxBlockSize: 8.00 MB (8388608)
           - MemoryLimit: 30.05 GB (32263944192)
           - PeakMemoryUsage: 1.04 GB (1112014848)
           - ScratchFileUsedBytes: 0
           - TotalBufferWaitTime: 0.000ns
           - TotalEncryptionTime: 0.000ns
           - TotalReadBlockTime: 0.000ns
        PLAN_ROOT_SINK:
           - PeakMemoryUsage: 0
        CodeGen:(Total: 42.000ms, non-child: 42.000ms, % non-child: 100.00%)
           - CodegenTime: 0.000ns
           - CompileTime: 0.000ns
           - LoadTime: 0.000ns
           - ModuleBitcodeSize: 1.91 MB (1997620)
           - NumFunctions: 0 (0)
           - NumInstructions: 0 (0)
           - OptimizationTime: 0.000ns
           - PeakMemoryUsage: 0
           - PrepareTime: 41.000ms
        EXCHANGE_NODE (id=10):
          BytesReceived(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
           - BytesReceived: 0
           - ConvertRowBatchTime: 0.000ns
           - DeserializeRowBatchTimer: 0.000ns
           - FirstBatchArrivalWaitTime: 0.000ns
           - PeakMemoryUsage: 0
           - RowsReturned: 0 (0)
           - RowsReturnedRate: 0
           - SendersBlockedTimer: 0.000ns
           - SendersBlockedTotalTimer(*): 0.000ns
    Averaged Fragment F03:(Total: 11s361ms, non-child: 0.000ns, % non-child: 0.00%)
      split sizes:  min: 0, max: 0, avg: 0, stddev: 0
      completion times: min:28s346ms  max:28s352ms  mean: 28s349ms  stddev:2.059ms
      execution rates: min:0.00 /sec  max:0.00 /sec  mean:0.00 /sec  stddev:0.00 /sec
      num instances: 5
       - AverageThreadTokens: 1.00 
       - BloomFilterBytes: 0
       - PeakMemoryUsage: 2.86 MB (2993872)
       - PerHostPeakMemUsage: 1.07 GB (1152354395)
       - RowsProduced: 968 (968)
       - TotalNetworkReceiveTime: 16s742ms       - TotalNetworkSendTime: 1.800ms
       - TotalStorageWaitTime: 0.000ns
       - TotalThreadsInvoluntaryContextSwitches: 11 (11)
       - TotalThreadsTotalWallClockTime: 11s317ms         - TotalThreadsSysTime: 3.199ms
         - TotalThreadsUserTime: 103.984ms
       - TotalThreadsVoluntaryContextSwitches: 44 (44)
      Fragment Instance Lifecycle Timings:
         - ExecTime: 9.500ms
           - ExecTreeExecTime: 3.000ms
         - OpenTime: 11s313ms
           - ExecTreeOpenTime: 11s209ms
         - PrepareTime: 44.599ms
           - ExecTreePrepareTime: 2.799ms
      BlockMgr:
         - BlockWritesOutstanding: 0 (0)
         - BlocksCreated: 229 (229)
         - BlocksRecycled: 53 (53)
         - BufferedPins: 0 (0)
         - BytesWritten: 0
         - MaxBlockSize: 8.00 MB (8388608)
         - MemoryLimit: 30.05 GB (32263944192)
         - PeakMemoryUsage: 1.06 GB (1134034944)
         - ScratchFileUsedBytes: 0
         - TotalBufferWaitTime: 0.000ns
         - TotalEncryptionTime: 0.000ns
         - TotalReadBlockTime: 0.000ns
      DataStreamSender (dst_id=10):(Total: 800.000us, non-child: 800.000us, % non-child: 100.00%)
         - BytesSent: 49.98 KB (51179)
         - NetworkThroughput(*): 48.81 MB/sec
         - OverallThroughput: 32.30 MB/sec
         - PeakMemoryUsage: 1.58 KB (1616)
         - RowsReturned: 968 (968)
         - SerializeBatchTime: 400.000us
         - TransmitDataRPCTime: 400.000us
         - UncompressedRowBatchSize: 120.88 KB (123784)
      CodeGen:(Total: 293.599ms, non-child: 293.599ms, % non-child: 100.00%)
         - CodegenTime: 5.400ms
         - CompileTime: 75.600ms
         - LoadTime: 0.000ns
         - ModuleBitcodeSize: 1.91 MB (1997620)
         - NumFunctions: 60 (60)
         - NumInstructions: 1.14K (1136)
         - OptimizationTime: 180.000ms
         - PeakMemoryUsage: 568.00 KB (581632)
         - PrepareTime: 36.599ms
      AGGREGATION_NODE (id=9):(Total: 11s213ms, non-child: 0.000ns, % non-child: 0.00%)
         - BuildTime: 1.400ms
         - GetNewBlockTime: 2.000ms
         - GetResultsTime: 0.000ns
         - HTResizeTime: 200.000us
         - HashBuckets: 13.11K (13107)
         - LargestPartitionPercent: 0 (0)
         - MaxPartitionLevel: 0 (0)
         - NumRepartitions: 0 (0)
         - PartitionsCreated: 16 (16)
         - PeakMemoryUsage: 2.53 MB (2654828)
         - PinTime: 0.000ns
         - RowsRepartitioned: 0 (0)
         - RowsReturned: 968 (968)
         - RowsReturnedRate: 34.00 /sec
         - SpilledPartitions: 0 (0)
         - UnpinTime: 0.000ns
      EXCHANGE_NODE (id=8):(Total: 16s742ms, non-child: 16s742ms, % non-child: 100.00%)
         - BytesReceived: 406.37 KB (416126)
         - ConvertRowBatchTime: 200.000us
         - DeserializeRowBatchTimer: 1.000ms
         - FirstBatchArrivalWaitTime: 16s562ms
         - PeakMemoryUsage: 0
         - RowsReturned: 8.15K (8154)
         - RowsReturnedRate: 291.00 /sec
         - SendersBlockedTimer: 0.000ns
         - SendersBlockedTotalTimer(*): 0.000ns
    Fragment F03:
      Instance 644f639b8d2a4480:d (host=etl002-slv-05:22000):(Total: 28s347ms, non-child: 11.000ms, % non-child: 0.04%)
        MemoryUsage(500.000ms): 2.30 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.31 MB, 2.61 MB
        ThreadUsage(500.000ms): 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
         - AverageThreadTokens: 1.00 
         - BloomFilterBytes: 0
         - PeakMemoryUsage: 2.86 MB (2993872)
         - PerHostPeakMemUsage: 840.38 MB (881203619)
         - RowsProduced: 2.39K (2389)
         - TotalNetworkReceiveTime: 28s036ms
         - TotalNetworkSendTime: 7.000ms
         - TotalStorageWaitTime: 0.000ns
         - TotalThreadsInvoluntaryContextSwitches: 29 (29)
         - TotalThreadsTotalWallClockTime: 28s309ms
           - TotalThreadsSysTime: 9.998ms
           - TotalThreadsUserTime: 256.961ms
         - TotalThreadsVoluntaryContextSwitches: 116 (116)
        Fragment Instance Lifecycle Timings:
           - ExecTime: 10.000ms
             - ExecTreeExecTime: 2.000ms
           - OpenTime: 28s299ms
             - ExecTreeOpenTime: 28s040ms
           - PrepareTime: 38.999ms
             - ExecTreePrepareTime: 999.992us
        BlockMgr:
           - BlockWritesOutstanding: 0 (0)
           - BlocksCreated: 203 (203)
           - BlocksRecycled: 69 (69)
           - BufferedPins: 0 (0)
           - BytesWritten: 0
           - MaxBlockSize: 8.00 MB (8388608)
           - MemoryLimit: 30.05 GB (32263944192)
           - PeakMemoryUsage: 818.50 MB (858259456)
           - ScratchFileUsedBytes: 0
           - TotalBufferWaitTime: 0.000ns
           - TotalEncryptionTime: 0.000ns
           - TotalReadBlockTime: 0.000ns
        DataStreamSender (dst_id=10):(Total: 1.000ms, non-child: 1.000ms, % non-child: 100.00%)
           - BytesSent: 123.14 KB (126093)
           - NetworkThroughput(*): 120.25 MB/sec
           - OverallThroughput: 120.25 MB/sec
           - PeakMemoryUsage: 1.58 KB (1616)
           - RowsReturned: 2.39K (2389)
           - SerializeBatchTime: 0.000ns
           - TransmitDataRPCTime: 1.000ms
           - UncompressedRowBatchSize: 298.35 KB (305506)
        CodeGen:(Total: 292.999ms, non-child: 292.999ms, % non-child: 100.00%)
           - CodegenTime: 5.000ms
           - CompileTime: 76.000ms
           - LoadTime: 0.000ns
           - ModuleBitcodeSize: 1.91 MB (1997620)
           - NumFunctions: 60 (60)
           - NumInstructions: 1.14K (1136)
           - OptimizationTime: 180.000ms
           - PeakMemoryUsage: 568.00 KB (581632)
           - PrepareTime: 35.999ms

below is the full plan,

 

https://gist.github.com/okouaihi/23d01ae9dd9a9ad04a3bea86bb272e72

 

Thanks

 

7 REPLIES 7

Re: Impala slow aggregation

Expert Contributor
Hi madachka,

It looks like the cross join results in 60M rows being sent to the
aggregation. Can you post the full query? What is the result you're
expecting here?

Cheers, Lars

Re: Impala slow aggregation

New Contributor

Thanks Lars,

 

You're right the CROSS JOIN is about to aggregate about 60M rows.

 

Basically, I am trying to pivot columns to rows and do aggregation based on columns values. the result I am looking for is:

ColumName | ColumnValue | SUM(event) for period 1| SUM(event) for period 2 | .. etc

 

Full query below

https://gist.github.com/okouaihi/c4cc49f0c381f6452aac81055b395ea8

 

Thanks for your help.

Re: Impala slow aggregation

Expert Contributor

Unfortunately I don't know how to speed up your particular query. You may want to have a look at Impala's query hints, especially the section about "Hints for join queries" on this page: https://www.cloudera.com/documentation/enterprise/latest/topics/impala_hints.html

 

If that doesn't get you anywhere, maybe someone else here has an idea.

 

Cheers, Lars

Re: Impala slow aggregation

Master Collaborator

Impala currently only uses a single thread per node to perform the aggregations, so your response time is certainly limited by that. We are actively working on fully multi-threaded execution.

 

That said, I cannot say how much improvement you might get from this, but I'd recommend using integers for representing your columns_list. Your CASE expression for the 'value' could have up to 20 string comparisons per row (and 10 on average). Using integers will generally be faster. If you try this, please do follow up since I'm curious about the results 🙂

Re: Impala slow aggregation

Master Collaborator

Setting this query option might also be worth a try:

SET disable_streaming_preaggregations=TRUE;

Re: Impala slow aggregation

New Contributor

Thanks Lars, Alex,

 

Using other hints nor "disable_streaming_preaggregations" parameter did not give better performance.

 

In the other hand, when I change CASE expression comparison from STRING to INTEGER, the query time is 15% faster 🙂

 

 

Looking forward for the multi-threads per node support for aggregation.

 

Thanks again!

 

 

Re: Impala slow aggregation

Master Collaborator

Thanks for following up! Glad to hear you could squeeze out some performance.