Reply
Highlighted
Explorer
Posts: 9
Registered: ‎01-25-2017
Accepted Solution

Imapla query is slow, where is the bottleneck?

I have this query:

SELECT *
FROM ft_test, ft_wafer
WHERE ft_test_parquet.id = ft_wafer_parquet.id
and month = 1
and day = 8 
and param = 2913;

It took about 3.5s. I want this to be faster. I have ready used techniques like Parquet, partition, compute stats

 

Here is the profile. Could someone look through this and find the bottleneck? Thanks.

 

Query Runtime Profile:
Query (id=dc4c084615fbf9bb:4261466f00000000):
  Summary:
    Session ID: 5d4edbf63653cdf6:1a59ff5354c9d4bd
    Session Type: BEESWAX
    Start Time: 2017-05-25 16:31:25.121391000
    End Time: 2017-05-25 16:31:28.584404000
    Query Type: QUERY
    Query State: FINISHED
    Query Status: OK
    Impala Version: impalad version 2.7.0 RELEASE (build a535b583202c4a81080098a10f952d377af1949d)
    User: root
    Connected User: root
    Delegated User:
    Network Address: ::ffff:127.0.0.1:58546
    Default Db: inspex
    Sql Statement: select *
FROM ft_test_partition, ft_wafer_parquet
WHERE ft_test_partition.file = ft_wafer_parquet.file
and month = 1
and day = 8
and param = 2913 limit 100
    Coordinator: mapr1:22000
    Query Options (non default):
    Plan:
----------------
Estimated Per-Host Requirements: Memory=704.67MB VCores=2

04:EXCHANGE [UNPARTITIONED]
|  limit: 100
|  hosts=1 per-host-mem=unavailable
|  tuple-ids=1,0 row-size=800B cardinality=1
|
02:HASH JOIN [INNER JOIN, BROADCAST]
|  hash predicates: ft_wafer_parquet.file = ft_test_partition.file
|  runtime filters: RF000 <- ft_test_partition.file
|  limit: 100
|  hosts=1 per-host-mem=690.00KB
|  tuple-ids=1,0 row-size=800B cardinality=1
|
|--03:EXCHANGE [BROADCAST]
|  |  hosts=1 per-host-mem=0B
|  |  tuple-ids=0 row-size=78B cardinality=8235
|  |
|  00:SCAN HDFS [inspex.ft_test_partition, RANDOM]
|     partitions=1/29 files=1 size=171.69MB
|     predicates: param = 2913
|     table stats: 813365826 rows total
|     column stats: all
|     hosts=1 per-host-mem=704.00MB
|     tuple-ids=0 row-size=78B cardinality=8235
|
01:SCAN HDFS [inspex.ft_wafer_parquet, RANDOM]
   partitions=1/1 files=1 size=66.83KB
   runtime filters: RF000 -> ft_wafer_parquet.file
   table stats: 1500 rows total
   column stats: all
   hosts=1 per-host-mem=192.00MB
   tuple-ids=1 row-size=722B cardinality=1500
----------------
    Estimated Per-Host Mem: 738904067
    Estimated Per-Host VCores: 2
    Request Pool: default-pool
    Admission result: Admitted immediately
    ExecSummary:
Operator          #Hosts   Avg Time   Max Time  #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
----------------------------------------------------------------------------------------------------------------------
04:EXCHANGE            1    0.000ns    0.000ns    100           1          0        -1.00 B  UNPARTITIONED
02:HASH JOIN           1   42.999ms   42.999ms      0           1    3.29 MB      690.00 KB  INNER JOIN, BROADCAST
|--03:EXCHANGE         1  999.990us  999.990us  9.15K       8.23K          0              0  BROADCAST
|  00:SCAN HDFS        1    2s973ms    2s973ms  9.15K       8.23K  106.05 MB      704.00 MB  inspex.ft_test_partition
01:SCAN HDFS           1   16.999ms   16.999ms  1.02K       1.50K    1.78 MB      192.00 MB  inspex.ft_wafer_parquet
    Planner Timeline: 15.315ms
       - Analysis finished: 5.081ms (5.081ms)
       - Equivalence classes computed: 5.601ms (519.374us)
       - Single node plan created: 9.054ms (3.453ms)
       - Runtime filters computed: 9.409ms (354.377us)
       - Distributed plan created: 11.507ms (2.098ms)
       - Planning finished: 15.315ms (3.808ms)
    Query Timeline: 3s463ms
       - Start execution: 0.000ns (0.000ns)
       - Planning finished: 17.999ms (17.999ms)
       - Submit for admission: 17.999ms (0.000ns)
       - Completed admission: 17.999ms (0.000ns)
       - Ready to start 2 remote fragments: 18.999ms (999.990us)
       - All 2 remote fragments started: 19.999ms (999.990us)
       - Rows available: 3s246ms (3s226ms)
       - First row fetched: 3s346ms (99.999ms)
       - Unregister query: 3s462ms (115.998ms)
     - ComputeScanRangeAssignmentTimer: 0.000ns
  ImpalaServer:
     - ClientFetchWaitTimer: 214.997ms
     - RowMaterializationTimer: 999.990us
  Execution Profile dc4c084615fbf9bb:4261466f00000000:(Total: 3s228ms, non-child: 0.000ns, % non-child: 0.00%)
    Number of filters: 1
    Filter routing table:
 ID  Src. Node  Tgt. Node(s)  Targets  Target type  Partition filter  Pending (Expected)  First arrived  Completed   Enabled
----------------------------------------------------------------------------------------------------------------------------
  0          2             1        1        LOCAL             false               0 (1)            N/A        N/A      true
    Fragment start latencies: Count: 2, 25th %-ile: 0, 50th %-ile: 0, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th %-ile: 1ms
    Final filter table:
 ID  Src. Node  Tgt. Node(s)  Targets  Target type  Partition filter  Pending (Expected)  First arrived  Completed   Enabled
----------------------------------------------------------------------------------------------------------------------------
  0          2             1        1        LOCAL             false               0 (1)            N/A        N/A      true
    Per Node Peak Memory Usage: mapr1:22000(108.65 MB)
     - FiltersReceived: 0 (0)
     - FinalizationTimer: 0.000ns
    Coordinator Fragment F02:(Total: 3s226ms, non-child: 0.000ns, % non-child: 0.00%)
      MemoryUsage(500.000ms): 16.00 KB, 16.00 KB, 16.00 KB, 16.00 KB, 16.00 KB, 16.00 KB, 16.00 KB
       - AverageThreadTokens: 0.00
       - BloomFilterBytes: 0
       - PeakMemoryUsage: 209.83 KB (214864)
       - PerHostPeakMemUsage: 0
       - PrepareTime: 0.000ns
       - RowsProduced: 0 (0)
       - TotalCpuTime: 101.999ms
       - TotalNetworkReceiveTime: 3s226ms
       - TotalNetworkSendTime: 0.000ns
       - TotalStorageWaitTime: 0.000ns
      BlockMgr:
         - BlockWritesOutstanding: 0 (0)
         - BlocksCreated: 48 (48)
         - BlocksRecycled: 0 (0)
         - BufferedPins: 0 (0)
         - BytesWritten: 0
         - MaxBlockSize: 8.00 MB (8388608)
         - MemoryLimit: 12.21 GB (13111148544)
         - PeakMemoryUsage: 256.00 KB (262144)
         - TotalBufferWaitTime: 0.000ns
         - TotalEncryptionTime: 0.000ns
         - TotalIntegrityCheckTime: 0.000ns
         - TotalReadBlockTime: 0.000ns
      EXCHANGE_NODE (id=4):(Total: 3s226ms, non-child: 0.000ns, % non-child: 0.00%)
        BytesReceived(500.000ms): 0, 0, 0, 0, 0, 0, 0
         - BytesReceived: 61.05 KB (62513)
         - ConvertRowBatchTime: 0.000ns
         - DeserializeRowBatchTimer: 0.000ns
         - FirstBatchArrivalWaitTime: 3s226ms
         - PeakMemoryUsage: 0
         - RowsReturned: 100 (100)
         - RowsReturnedRate: 30.00 /sec
         - SendersBlockedTimer: 0.000ns
         - SendersBlockedTotalTimer(*): 0.000ns
    Averaged Fragment F00:(Total: 3s001ms, non-child: 0.000ns, % non-child: 0.00%)
      split sizes:  min: 66.83 KB, max: 66.83 KB, avg: 66.83 KB, stddev: 0
      completion times: min:3s227ms  max:3s227ms  mean: 3s227ms  stddev:0.000ns
      execution rates: min:20.70 KB/sec  max:20.70 KB/sec  mean:20.70 KB/sec  stddev:0.00 /sec
      num instances: 1
       - AverageThreadTokens: 1.86
       - BloomFilterBytes: 1.00 MB (1048576)
       - PeakMemoryUsage: 5.07 MB (5320864)
       - PerHostPeakMemUsage: 108.65 MB (113924736)
       - PrepareTime: 38.999ms
       - RowsProduced: 1.02K (1024)
       - TotalCpuTime: 3s232ms
       - TotalNetworkReceiveTime: 2s940ms
       - TotalNetworkSendTime: 0.000ns
       - TotalStorageWaitTime: 13.999ms
      CodeGen:(Total: 262.997ms, non-child: 262.997ms, % non-child: 100.00%)
         - CodegenTime: 999.990us
         - CompileTime: 73.999ms
         - LoadTime: 0.000ns
         - ModuleBitcodeSize: 1.86 MB (1953028)
         - NumFunctions: 85 (85)
         - NumInstructions: 2.86K (2857)
         - OptimizationTime: 151.998ms
         - PrepareTime: 36.999ms
      DataStreamSender (dst_id=4):(Total: 999.990us, non-child: 999.990us, % non-child: 100.00%)
         - BytesSent: 61.05 KB (62513)
         - NetworkThroughput(*): 0.00 /sec
         - OverallThroughput: 59.62 MB/sec
         - RowsReturned: 1.02K (1024)
         - SerializeBatchTime: 999.990us
         - TransmitDataRPCTime: 0.000ns
         - UncompressedRowBatchSize: 185.83 KB (190290)
      HASH_JOIN_NODE (id=2):(Total: 3s001ms, non-child: 42.999ms, % non-child: 1.43%)
         - BuildPartitionTime: 1.999ms
         - BuildRows: 9.15K (9153)
         - BuildRowsPartitioned: 9.15K (9153)
         - BuildTime: 0.000ns
         - GetNewBlockTime: 0.000ns
         - HashBuckets: 16.38K (16384)
         - HashCollisions: 0 (0)
         - LargestPartitionPercent: 6 (6)
         - MaxPartitionLevel: 0 (0)
         - NumRepartitions: 0 (0)
         - PartitionsCreated: 16 (16)
         - PeakMemoryUsage: 3.29 MB (3445888)
         - PinTime: 0.000ns
         - ProbeRows: 1.02K (1024)
         - ProbeRowsPartitioned: 0 (0)
         - ProbeTime: 0.000ns
         - RowsReturned: 0 (0)
         - RowsReturnedRate: 0
         - SpilledPartitions: 0 (0)
         - UnpinTime: 0.000ns
        EXCHANGE_NODE (id=3):(Total: 2s941ms, non-child: 2s941ms, % non-child: 100.00%)
           - BytesReceived: 314.85 KB (322407)
           - ConvertRowBatchTime: 0.000ns
           - DeserializeRowBatchTimer: 0.000ns
           - FirstBatchArrivalWaitTime: 0.000ns
           - PeakMemoryUsage: 0
           - RowsReturned: 9.15K (9153)
           - RowsReturnedRate: 3.11 K/sec
           - SendersBlockedTimer: 0.000ns
           - SendersBlockedTotalTimer(*): 0.000ns
      HDFS_SCAN_NODE (id=1):(Total: 16.999ms, non-child: 16.999ms, % non-child: 100.00%)
         - AverageHdfsReadThreadConcurrency: 0.00
         - AverageScannerThreadConcurrency: 0.00
         - BytesRead: 128.51 KB (131593)
         - BytesReadDataNodeCache: 0
         - BytesReadLocal: 0
         - BytesReadRemoteUnexpected: 0
         - BytesReadShortCircuit: 0
         - DecompressionTime: 0.000ns
         - MaxCompressedTextFileLength: 0
         - NumColumns: 49 (49)
         - NumDisksAccessed: 0 (0)
         - NumRowGroups: 1 (1)
         - NumScannerThreadsStarted: 1 (1)
         - PeakMemoryUsage: 1.78 MB (1866400)
         - PerReadThreadRawHdfsThroughput: 4.18 MB/sec
         - RemoteScanRanges: 0 (0)
         - RowsRead: 1.50K (1500)
         - RowsReturned: 1.02K (1024)
         - RowsReturnedRate: 60.23 K/sec
         - ScanRangesComplete: 1 (1)
         - ScannerThreadsInvoluntaryContextSwitches: 0 (0)
         - ScannerThreadsTotalWallClockTime: 14.999ms
           - MaterializeTupleTime(*): 999.990us
           - ScannerThreadsSysTime: 0.000ns
           - ScannerThreadsUserTime: 2.216ms
         - ScannerThreadsVoluntaryContextSwitches: 27 (27)
         - TotalRawHdfsReadTime(*): 29.999ms
         - TotalReadThroughput: 0.00 /sec
        Filter 0 (1.00 MB):
           - Rows processed: 1.50K (1500)
           - Rows rejected: 3 (3)
           - Rows total: 1.50K (1500)
    Averaged Fragment F01:(Total: 3s191ms, non-child: 160.998ms, % non-child: 5.04%)
      split sizes:  min: 171.69 MB, max: 171.69 MB, avg: 171.69 MB, stddev: 0
      completion times: min:3s210ms  max:3s210ms  mean: 3s210ms  stddev:0.000ns
      execution rates: min:53.47 MB/sec  max:53.47 MB/sec  mean:53.47 MB/sec  stddev:0.00 /sec
      num instances: 1
       - AverageThreadTokens: 1.86
       - BloomFilterBytes: 0
       - PeakMemoryUsage: 106.05 MB (111206232)
       - PerHostPeakMemUsage: 108.65 MB (113924736)
       - PrepareTime: 33.999ms
       - RowsProduced: 9.15K (9153)
       - TotalCpuTime: 6s330ms
       - TotalNetworkReceiveTime: 0.000ns
       - TotalNetworkSendTime: 0.000ns
       - TotalStorageWaitTime: 36.999ms
      CodeGen:(Total: 51.999ms, non-child: 51.999ms, % non-child: 100.00%)
         - CodegenTime: 999.990us
         - CompileTime: 5.999ms
         - LoadTime: 0.000ns
         - ModuleBitcodeSize: 1.86 MB (1953028)
         - NumFunctions: 13 (13)
         - NumInstructions: 228 (228)
         - OptimizationTime: 11.999ms
         - PrepareTime: 33.999ms
      DataStreamSender (dst_id=3):(Total: 4.999ms, non-child: 4.999ms, % non-child: 100.00%)
         - BytesSent: 314.85 KB (322407)
         - NetworkThroughput(*): 153.74 MB/sec
         - OverallThroughput: 61.49 MB/sec
         - RowsReturned: 9.15K (9153)
         - SerializeBatchTime: 2.999ms
         - TransmitDataRPCTime: 1.999ms
         - UncompressedRowBatchSize: 769.68 KB (788150)
      HDFS_SCAN_NODE (id=0):(Total: 2s973ms, non-child: 2s973ms, % non-child: 100.00%)
         - AverageHdfsReadThreadConcurrency: 0.00
         - AverageScannerThreadConcurrency: 1.00
         - BytesRead: 171.79 MB (180132958)
         - BytesReadDataNodeCache: 0
         - BytesReadLocal: 0
         - BytesReadRemoteUnexpected: 0
         - BytesReadShortCircuit: 0
         - DecompressionTime: 130.998ms
         - MaxCompressedTextFileLength: 0
         - NumColumns: 8 (8)
         - NumDisksAccessed: 1 (1)
         - NumRowGroups: 1 (1)
         - NumScannerThreadsStarted: 1 (1)
         - PeakMemoryUsage: 106.05 MB (111196408)
         - PerReadThreadRawHdfsThroughput: 434.91 MB/sec
         - RemoteScanRanges: 0 (0)
         - RowsRead: 28.05M (28047320)
         - RowsReturned: 9.15K (9153)
         - RowsReturnedRate: 3.08 K/sec
         - ScanRangesComplete: 1 (1)
         - ScannerThreadsInvoluntaryContextSwitches: 112 (112)
         - ScannerThreadsTotalWallClockTime: 3s157ms
           - MaterializeTupleTime(*): 2s977ms
           - ScannerThreadsSysTime: 566.243ms
           - ScannerThreadsUserTime: 2s525ms
         - ScannerThreadsVoluntaryContextSwitches: 100 (100)
         - TotalRawHdfsReadTime(*): 394.996ms
         - TotalReadThroughput: 57.11 MB/sec
    Fragment F00:
      Instance dc4c084615fbf9bb:4261466f00000001 (host=mapr1:22000):(Total: 3s001ms, non-child: 0.000ns, % non-child: 0.00%)
        Hdfs split stats (<volume id>:<# splits>/<split lengths>): -1:1/66.83 KB
        Filter 0 arrival: 3s210ms
        MemoryUsage(500.000ms): 0, 3.05 MB, 3.14 MB, 3.13 MB, 3.09 MB, 3.12 MB, 3.19 MB
        ThreadUsage(500.000ms): 1, 2, 2, 2, 2, 2, 2
         - AverageThreadTokens: 1.86
         - BloomFilterBytes: 1.00 MB (1048576)
         - PeakMemoryUsage: 5.07 MB (5320864)
         - PerHostPeakMemUsage: 108.65 MB (113924736)
         - PrepareTime: 38.999ms
         - RowsProduced: 1.02K (1024)
         - TotalCpuTime: 3s232ms
         - TotalNetworkReceiveTime: 2s940ms
         - TotalNetworkSendTime: 0.000ns
         - TotalStorageWaitTime: 13.999ms
        CodeGen:(Total: 262.997ms, non-child: 262.997ms, % non-child: 100.00%)
           - CodegenTime: 999.990us
           - CompileTime: 73.999ms
           - LoadTime: 0.000ns
           - ModuleBitcodeSize: 1.86 MB (1953028)
           - NumFunctions: 85 (85)
           - NumInstructions: 2.86K (2857)
           - OptimizationTime: 151.998ms
           - PrepareTime: 36.999ms
        DataStreamSender (dst_id=4):(Total: 999.990us, non-child: 999.990us, % non-child: 100.00%)
           - BytesSent: 61.05 KB (62513)
           - NetworkThroughput(*): 0.00 /sec
           - OverallThroughput: 59.62 MB/sec
           - RowsReturned: 1.02K (1024)
           - SerializeBatchTime: 999.990us
           - TransmitDataRPCTime: 0.000ns
           - UncompressedRowBatchSize: 185.83 KB (190290)
        HASH_JOIN_NODE (id=2):(Total: 3s001ms, non-child: 42.999ms, % non-child: 1.43%)
          ExecOption: Build Side Codegen Enabled, Probe Side Codegen Enabled, Hash Table Construction Codegen Enabled, Join Build-Side Prepared Asynchronously, 1 of 1 Runtime Filter Published
           - BuildPartitionTime: 1.999ms
           - BuildRows: 9.15K (9153)
           - BuildRowsPartitioned: 9.15K (9153)
           - BuildTime: 0.000ns
           - GetNewBlockTime: 0.000ns
           - HashBuckets: 16.38K (16384)
           - HashCollisions: 0 (0)
           - LargestPartitionPercent: 6 (6)
           - MaxPartitionLevel: 0 (0)
           - NumRepartitions: 0 (0)
           - PartitionsCreated: 16 (16)
           - PeakMemoryUsage: 3.29 MB (3445888)
           - PinTime: 0.000ns
           - ProbeRows: 1.02K (1024)
           - ProbeRowsPartitioned: 0 (0)
           - ProbeTime: 0.000ns
           - RowsReturned: 0 (0)
           - RowsReturnedRate: 0
           - SpilledPartitions: 0 (0)
           - UnpinTime: 0.000ns
          EXCHANGE_NODE (id=3):(Total: 2s941ms, non-child: 999.990us, % non-child: 0.03%)
            BytesReceived(500.000ms): 70.08 KB, 127.46 KB, 162.66 KB, 230.08 KB, 301.42 KB, 312.19 KB
             - BytesReceived: 314.85 KB (322407)
             - ConvertRowBatchTime: 0.000ns
             - DeserializeRowBatchTimer: 0.000ns
             - FirstBatchArrivalWaitTime: 0.000ns
             - PeakMemoryUsage: 0
             - RowsReturned: 9.15K (9153)
             - RowsReturnedRate: 3.11 K/sec
             - SendersBlockedTimer: 0.000ns
             - SendersBlockedTotalTimer(*): 0.000ns
        HDFS_SCAN_NODE (id=1):(Total: 16.999ms, non-child: 16.999ms, % non-child: 100.00%)
          ExecOption: Expr Evaluation Codegen Disabled, PARQUET Codegen Enabled
          Hdfs split stats (<volume id>:<# splits>/<split lengths>): -1:1/66.83 KB
          Runtime filters: All filters arrived. Waited 0
          BytesRead(500.000ms): 0, 0, 0, 0, 0, 0
           - AverageHdfsReadThreadConcurrency: 0.00
           - AverageScannerThreadConcurrency: 0.00
           - BytesRead: 128.51 KB (131593)
           - BytesReadDataNodeCache: 0
           - BytesReadLocal: 0
           - BytesReadRemoteUnexpected: 0
           - BytesReadShortCircuit: 0
           - DecompressionTime: 0.000ns
           - MaxCompressedTextFileLength: 0
           - NumColumns: 49 (49)
           - NumDisksAccessed: 0 (0)
           - NumRowGroups: 1 (1)
           - NumScannerThreadsStarted: 1 (1)
           - PeakMemoryUsage: 1.78 MB (1866400)
           - PerReadThreadRawHdfsThroughput: 4.18 MB/sec
           - RemoteScanRanges: 0 (0)
           - RowsRead: 1.50K (1500)
           - RowsReturned: 1.02K (1024)
           - RowsReturnedRate: 60.23 K/sec
           - ScanRangesComplete: 1 (1)
           - ScannerThreadsInvoluntaryContextSwitches: 0 (0)
           - ScannerThreadsTotalWallClockTime: 14.999ms
             - MaterializeTupleTime(*): 999.990us
             - ScannerThreadsSysTime: 0.000ns
             - ScannerThreadsUserTime: 2.216ms
           - ScannerThreadsVoluntaryContextSwitches: 27 (27)
           - TotalRawHdfsReadTime(*): 29.999ms
           - TotalReadThroughput: 0.00 /sec
          Filter 0 (1.00 MB):
             - Rows processed: 1.50K (1500)
             - Rows rejected: 3 (3)
             - Rows total: 1.50K (1500)
    Fragment F01:
      Instance dc4c084615fbf9bb:4261466f00000002 (host=mapr1:22000):(Total: 3s191ms, non-child: 160.998ms, % non-child: 5.04%)
        Hdfs split stats (<volume id>:<# splits>/<split lengths>): -1:1/171.69 MB
        MemoryUsage(500.000ms): 0, 91.50 MB, 82.91 MB, 86.63 MB, 67.68 MB, 66.67 MB, 53.51 MB
        ThreadUsage(500.000ms): 1, 2, 2, 2, 2, 2, 2
         - AverageThreadTokens: 1.86
         - BloomFilterBytes: 0
         - PeakMemoryUsage: 106.05 MB (111206232)
         - PerHostPeakMemUsage: 108.65 MB (113924736)
         - PrepareTime: 33.999ms
         - RowsProduced: 9.15K (9153)
         - TotalCpuTime: 6s330ms
         - TotalNetworkReceiveTime: 0.000ns
         - TotalNetworkSendTime: 0.000ns
         - TotalStorageWaitTime: 36.999ms
        CodeGen:(Total: 51.999ms, non-child: 51.999ms, % non-child: 100.00%)
           - CodegenTime: 999.990us
           - CompileTime: 5.999ms
           - LoadTime: 0.000ns
           - ModuleBitcodeSize: 1.86 MB (1953028)
           - NumFunctions: 13 (13)
           - NumInstructions: 228 (228)
           - OptimizationTime: 11.999ms
           - PrepareTime: 33.999ms
        DataStreamSender (dst_id=3):(Total: 4.999ms, non-child: 4.999ms, % non-child: 100.00%)
           - BytesSent: 314.85 KB (322407)
           - NetworkThroughput(*): 153.74 MB/sec
           - OverallThroughput: 61.49 MB/sec
           - RowsReturned: 9.15K (9153)
           - SerializeBatchTime: 2.999ms
           - TransmitDataRPCTime: 1.999ms
           - UncompressedRowBatchSize: 769.68 KB (788150)
        HDFS_SCAN_NODE (id=0):(Total: 2s973ms, non-child: 2s973ms, % non-child: 100.00%)
          ExecOption: Expr Evaluation Codegen Enabled, PARQUET Codegen Enabled, Codegen enabled: 1 out of 1
          Hdfs split stats (<volume id>:<# splits>/<split lengths>): -1:1/171.69 MB
          Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0% 3:0% 4:0% 5:0%
          File Formats: PARQUET/SNAPPY:8
          BytesRead(500.000ms): 88.45 MB, 112.45 MB, 136.45 MB, 152.66 MB, 168.66 MB, 171.79 MB
           - AverageHdfsReadThreadConcurrency: 0.00
           - AverageScannerThreadConcurrency: 1.00
           - BytesRead: 171.79 MB (180132958)
           - BytesReadDataNodeCache: 0
           - BytesReadLocal: 0
           - BytesReadRemoteUnexpected: 0
           - BytesReadShortCircuit: 0
           - DecompressionTime: 130.998ms
           - MaxCompressedTextFileLength: 0
           - NumColumns: 8 (8)
           - NumDisksAccessed: 1 (1)
           - NumRowGroups: 1 (1)
           - NumScannerThreadsStarted: 1 (1)
           - PeakMemoryUsage: 106.05 MB (111196408)
           - PerReadThreadRawHdfsThroughput: 434.91 MB/sec
           - RemoteScanRanges: 0 (0)
           - RowsRead: 28.05M (28047320)
           - RowsReturned: 9.15K (9153)
           - RowsReturnedRate: 3.08 K/sec
           - ScanRangesComplete: 1 (1)
           - ScannerThreadsInvoluntaryContextSwitches: 112 (112)
           - ScannerThreadsTotalWallClockTime: 3s157ms
             - MaterializeTupleTime(*): 2s977ms
             - ScannerThreadsSysTime: 566.243ms
             - ScannerThreadsUserTime: 2s525ms
           - ScannerThreadsVoluntaryContextSwitches: 100 (100)
           - TotalRawHdfsReadTime(*): 394.996ms
           - TotalReadThroughput: 57.11 MB/sec
Cloudera Employee
Posts: 258
Registered: ‎07-29-2015

Re: Imapla query is slow, where is the bottleneck?

In the summary you can see that the average time in Node 0 is 2s973ms.The time in scan nodes is the time spent waiting for scanner threads to return a row batch. That looks like the bottleneck.

 

Drilling down into the profile for that node (HDFS_SCAN_NODE (id=0):(Total: 2s973ms, non-child: 2s973ms, % non-child: 100.00%)) we can see that it scanned a single Parquet file and spent almost all of that time materializing tuples (MaterializeTupleTime).

 

So there are two things going on:

  • Tuple materialisation is expensive.
  • There's no parallelism available because there's only a single parquet file in the partition. Usually for large tables having large files is a good thing, because there's less overhead, but in this case you're losing out on some parallelism.

We have a lot of changes in the pipeline at various stages to improve the first part - either by making it more efficient or by getting better at skipping over data. There's lots of room for improvement. Unfortunately that doesn't help you immediately.

Explorer
Posts: 9
Registered: ‎01-25-2017

Re: Imapla query is slow, where is the bottleneck?

Thank you sir for helping me walk through the profile. This is very informative. 

Announcements