Member since
09-23-2016
4
Posts
0
Kudos Received
0
Solutions
03-06-2017
03:04 AM
Thanks EricL. Plenty to dig into with these logs, quick wins so far: select a, b, c from database.view is slow as it runs a describe statement prior to SQL execution. select a, b, c from table is slow as it runs a describe statement prior to SQL execution, despite having the database name in my connection URL. select a, b, c from database.table is faster as no additional describe statement is executed, but still a good bit slower than simply executed it in Impala-Shell. I'm going to dig into the additional JDBC overhead. EDIT: A show tables statement is executed everytime prior to the SQL execution, adding a 1 sec delay.
... View more
03-06-2017
02:05 AM
I got these from Cloudera Manager. JDBC: Query (id=3e40a60f2dfc2554:bac26d70201f8d9c)
Summary
Session ID: f042aa249717205e:ae98b27fcfcd0783
Session Type: HIVESERVER2
HiveServer2 Protocol Version: V6
Start Time: 2017-03-03 11:38:07.269796000
End Time: 2017-03-03 11:38:08.508511000
Query Type: QUERY
Query State: FINISHED
Query Status: OK
Impala Version: impalad version 2.6.0-cdh5.8.2 RELEASE (build f25aa5b2bcdabf1eb4233747a7b04a067059ee3b)
User:
Connected User:
Delegated User:
Network Address: ::ffff:1.2.3.4:50034
Default Db: default
Sql Statement: <QUERY>
Coordinator: server1.example.com:22000
Query Options (non default):
Plan:
----------------
Estimated Per-Host Requirements: Memory=768.00MB VCores=1
01:EXCHANGE [UNPARTITIONED]
| hosts=1 per-host-mem=unavailable
| tuple-ids=0 row-size=614B cardinality=1
|
00:SCAN HDFS [default.impala_table, RANDOM]
partitions=1/13999 files=1 size=132.88KB
predicates: (default.impala_table.acct_id = 660)
table stats: 760514828 rows total
column stats: all
hosts=1 per-host-mem=768.00MB
tuple-ids=0 row-size=614B cardinality=1
----------------
Estimated Per-Host Mem: 805306368
Estimated Per-Host VCores: 1
Request Pool: root.default
Admission result: Admitted immediately
ExecSummary:
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
----------------------------------------------------------------------------------------------------------------------
01:EXCHANGE 1 0.000ns 0.000ns 1 1 0 -1.00 B UNPARTITIONED
00:SCAN HDFS 1 17.999ms 17.999ms 1 1 1.56 MB 768.00 MB default.impala_table
Planner Timeline
Analysis finished: 9883819
Equivalence classes computed: 13302816
Single node plan created: 16281907
Runtime filters computed: 16321655
Distributed plan created: 18076216
Lineage info computed: 19909717
Planning finished: 26827613
Query Timeline
Start execution: 0
Planning finished: 30000364
Submit for admission: 31000376
Completed admission: 32000388
Ready to start 1 remote fragments: 33000400
All 1 remote fragments started: 39000472
Rows available: 54000656
First row fetched: 759009252
Unregister query: 1239015104
ImpalaServer
- ClientFetchWaitTimer: 1185014448
- InactiveTotalTime: 0
- RowMaterializationTimer: 0
- TotalTime: 0
Execution Profile 3e40a60f2dfc2554:bac26d70201f8d9c
Number of filters: 0
Filter routing table:
ID Src. Node Tgt. Node(s) Targets Target type Partition filter Pending (Expected) First arrived Completed
-------------------------------------------------------------------------------------------------------------------
Fragment start latencies: Count: 1, 25th %-ile: 5ms, 50th %-ile: 5ms, 75th %-ile: 5ms, 90th %-ile: 5ms, 95th %-ile: 5ms, 99.9th %-ile: 5ms
Per Node Peak Memory Usage: server8.example.com:22000(1.57 MB) server1.example.com:22000(0)
- FiltersReceived: 0
- FinalizationTimer: 0
- InactiveTotalTime: 0
- TotalTime: 22000268
Coordinator Fragment F01
- AverageThreadTokens: 0.0
- BloomFilterBytes: 0
- InactiveTotalTime: 0
- PeakMemoryUsage: 12296
- PerHostPeakMemUsage: 0
- PrepareTime: 0
- RowsProduced: 1
- TotalCpuTime: 712008680
- TotalNetworkReceiveTime: 15000184
- TotalNetworkSendTime: 0
- TotalStorageWaitTime: 0
- TotalTime: 16000196
BlockMgr
- BlockWritesOutstanding: 0
- BlocksCreated: 0
- BlocksRecycled: 0
- BufferedPins: 0
- BytesWritten: 0
- InactiveTotalTime: 0
- MaxBlockSize: 8388608
- MemoryLimit: 4294967296
- PeakMemoryUsage: 0
- TotalBufferWaitTime: 0
- TotalEncryptionTime: 0
- TotalIntegrityCheckTime: 0
- TotalReadBlockTime: 0
- TotalTime: 0
EXCHANGE_NODE (id=1)
- BytesReceived: 441
- ConvertRowBatchTime: 0
- DeserializeRowBatchTimer: 0
- FirstBatchArrivalWaitTime: 15000184
- InactiveTotalTime: 15000184
- PeakMemoryUsage: 0
- RowsReturned: 1
- RowsReturnedRate: 66
- SendersBlockedTimer: 0
- SendersBlockedTotalTimer(*): 0
- TotalTime: 15000184
Averaged Fragment F00
split sizes: min: 132.88 KB, max: 132.88 KB, avg: 132.88 KB, stddev: 0
completion times: min:17.000ms max:17.000ms mean: 17.000ms stddev:0.000ns
execution rates: min:7.63 MB/sec max:7.63 MB/sec mean:7.63 MB/sec stddev:0.00 /sec
num instances: 1
- AverageThreadTokens: 0.0
- BloomFilterBytes: 0
- InactiveTotalTime: 0
- PeakMemoryUsage: 1644840
- PerHostPeakMemUsage: 1644840
- PrepareTime: 0
- RowsProduced: 1
- TotalCpuTime: 22999884
- TotalNetworkReceiveTime: 0
- TotalNetworkSendTime: 0
- TotalStorageWaitTime: 16999908
- TotalTime: 20999892
BlockMgr
- BlockWritesOutstanding: 0
- BlocksCreated: 0
- BlocksRecycled: 0
- BufferedPins: 0
- BytesWritten: 0
- InactiveTotalTime: 0
- MaxBlockSize: 8388608
- MemoryLimit: 4294967296
- PeakMemoryUsage: 0
- TotalBufferWaitTime: 0
- TotalEncryptionTime: 0
- TotalIntegrityCheckTime: 0
- TotalReadBlockTime: 0
- TotalTime: 0
DataStreamSender (dst_id=1)
- BytesSent: 441
- InactiveTotalTime: 0
- NetworkThroughput(*): 0
- OverallThroughput: 441001
- PeakMemoryUsage: 296
- RowsReturned: 1
- SerializeBatchTime: 999996
- TotalTime: 999996
- TransmitDataRPCTime: 0
- UncompressedRowBatchSize: 612
HDFS_SCAN_NODE (id=0)
- AverageHdfsReadThreadConcurrency: 0.0
- AverageScannerThreadConcurrency: 0.0
- BytesRead: 199843
- BytesReadDataNodeCache: 0
- BytesReadLocal: 199843
- BytesReadRemoteUnexpected: 0
- BytesReadShortCircuit: 199843
- DecompressionTime: 0
- InactiveTotalTime: 0
- MaxCompressedTextFileLength: 0
- NumColumns: 48
- NumDisksAccessed: 1
- NumRowGroups: 1
- NumScannerThreadsStarted: 1
- PeakMemoryUsage: 1636352
- PerReadThreadRawHdfsThroughput: 0
- RemoteScanRanges: 0
- RowsRead: 760
- RowsReturned: 1
- RowsReturnedRate: 55
- ScanRangesComplete: 1
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsTotalWallClockTime: 18999900
- MaterializeTupleTime(*): 0
- ScannerThreadsSysTime: 0
- ScannerThreadsUserTime: 2624000
- ScannerThreadsVoluntaryContextSwitches: 30
- TotalRawHdfsReadTime(*): 0
- TotalReadThroughput: 0
- TotalTime: 17999904
Fragment F00
Instance 3e40a60f2dfc2554:bac26d70201f8d9e (host=server8.example.com:22000)
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/132.88 KB
- AverageThreadTokens: 0.0
- BloomFilterBytes: 0
- InactiveTotalTime: 0
- PeakMemoryUsage: 1644840
- PerHostPeakMemUsage: 1644840
- PrepareTime: 0
- RowsProduced: 1
- TotalCpuTime: 22999884
- TotalNetworkReceiveTime: 0
- TotalNetworkSendTime: 0
- TotalStorageWaitTime: 16999908
- TotalTime: 20999892
BlockMgr
- BlockWritesOutstanding: 0
- BlocksCreated: 0
- BlocksRecycled: 0
- BufferedPins: 0
- BytesWritten: 0
- InactiveTotalTime: 0
- MaxBlockSize: 8388608
- MemoryLimit: 4294967296
- PeakMemoryUsage: 0
- TotalBufferWaitTime: 0
- TotalEncryptionTime: 0
- TotalIntegrityCheckTime: 0
- TotalReadBlockTime: 0
- TotalTime: 0
DataStreamSender (dst_id=1)
- BytesSent: 441
- InactiveTotalTime: 0
- NetworkThroughput(*): 0
- OverallThroughput: 441001
- PeakMemoryUsage: 296
- RowsReturned: 1
- SerializeBatchTime: 999996
- TotalTime: 999996
- TransmitDataRPCTime: 0
- UncompressedRowBatchSize: 612
HDFS_SCAN_NODE (id=0)
ExecOption: Expr Evaluation Codegen Disabled, Codegen enabled: 0 out of 1
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/132.88 KB
Hdfs Read Thread Concurrency Bucket: 0:0% 1:0% 2:0% 3:0%
File Formats: PARQUET/SNAPPY:48
- AverageHdfsReadThreadConcurrency: 0.0
- AverageScannerThreadConcurrency: 0.0
- BytesRead: 199843
- BytesReadDataNodeCache: 0
- BytesReadLocal: 199843
- BytesReadRemoteUnexpected: 0
- BytesReadShortCircuit: 199843
- DecompressionTime: 0
- InactiveTotalTime: 0
- MaxCompressedTextFileLength: 0
- NumColumns: 48
- NumDisksAccessed: 1
- NumRowGroups: 1
- NumScannerThreadsStarted: 1
- PeakMemoryUsage: 1636352
- PerReadThreadRawHdfsThroughput: 0
- RemoteScanRanges: 0
- RowsRead: 760
- RowsReturned: 1
- RowsReturnedRate: 55
- ScanRangesComplete: 1
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsTotalWallClockTime: 18999900
- MaterializeTupleTime(*): 0
- ScannerThreadsSysTime: 0
- ScannerThreadsUserTime: 2624000
- ScannerThreadsVoluntaryContextSwitches: 30
- TotalRawHdfsReadTime(*): 0
- TotalReadThroughput: 0
- TotalTime: 17999904 Impala-shell: Query (id=494fccf3229a725c:40217c9c3c50929a)
Summary
Session ID: 454fbcd23796af47:82ab459a4077e5ab
Session Type: BEESWAX
Start Time: 2017-03-03 11:40:15.660573000
End Time: 2017-03-03 11:40:15.806061000
Query Type: QUERY
Query State: FINISHED
Query Status: OK
Impala Version: impalad version 2.6.0-cdh5.8.2 RELEASE (build f25aa5b2bcdabf1eb4233747a7b04a067059ee3b)
User: impala_user
Connected User: impala_user
Delegated User:
Network Address: ::ffff:4.3.2.1:55960
Default Db: default
Sql Statement: <QUERY>
Coordinator: server1.example.com:22000
Query Options (non default):
Plan:
----------------
Estimated Per-Host Requirements: Memory=768.00MB VCores=1
01:EXCHANGE [UNPARTITIONED]
| hosts=1 per-host-mem=unavailable
| tuple-ids=0 row-size=614B cardinality=1
|
00:SCAN HDFS [default.impala_table, RANDOM]
partitions=1/13999 files=1 size=132.88KB
predicates: (default.impala_table.acct_id = 660)
table stats: 760514828 rows total
column stats: all
hosts=1 per-host-mem=768.00MB
tuple-ids=0 row-size=614B cardinality=1
----------------
Estimated Per-Host Mem: 805306368
Estimated Per-Host VCores: 1
Request Pool: root.impala_user
Admission result: Admitted immediately
ExecSummary:
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
----------------------------------------------------------------------------------------------------------------------
01:EXCHANGE 1 0.000ns 0.000ns 1 1 0 -1.00 B UNPARTITIONED
00:SCAN HDFS 1 23.999ms 23.999ms 1 1 1.56 MB 768.00 MB default.impala_table
Planner Timeline
Analysis finished: 9689649
Equivalence classes computed: 15036979
Single node plan created: 19349122
Runtime filters computed: 19426476
Distributed plan created: 21875782
Lineage info computed: 24636383
Planning finished: 33188255
Query Timeline
Start execution: 0
Planning finished: 36000448
Submit for admission: 37000464
Completed admission: 37000464
Ready to start 1 remote fragments: 38000476
All 1 remote fragments started: 40000500
Rows available: 64000800
First row fetched: 142001772
Unregister query: 145001808
ImpalaServer
- ClientFetchWaitTimer: 81001008
- InactiveTotalTime: 0
- RowMaterializationTimer: 0
- TotalTime: 0
Execution Profile 494fccf3229a725c:40217c9c3c50929a
Number of filters: 0
Filter routing table:
ID Src. Node Tgt. Node(s) Targets Target type Partition filter Pending (Expected) First arrived Completed
-------------------------------------------------------------------------------------------------------------------
Fragment start latencies: Count: 1, 25th %-ile: 1ms, 50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th %-ile: 1ms
Per Node Peak Memory Usage: server8.example.com:22000(1.57 MB) server1.example.com:22000(0)
- FiltersReceived: 0
- FinalizationTimer: 0
- InactiveTotalTime: 0
- TotalTime: 27000336
Coordinator Fragment F01
- AverageThreadTokens: 0.0
- BloomFilterBytes: 0
- InactiveTotalTime: 0
- PeakMemoryUsage: 12296
- PerHostPeakMemUsage: 0
- PrepareTime: 0
- RowsProduced: 1
- TotalCpuTime: 80000996
- TotalNetworkReceiveTime: 24000300
- TotalNetworkSendTime: 0
- TotalStorageWaitTime: 0
- TotalTime: 24000300
BlockMgr
- BlockWritesOutstanding: 0
- BlocksCreated: 0
- BlocksRecycled: 0
- BufferedPins: 0
- BytesWritten: 0
- InactiveTotalTime: 0
- MaxBlockSize: 8388608
- MemoryLimit: 4294967296
- PeakMemoryUsage: 0
- TotalBufferWaitTime: 0
- TotalEncryptionTime: 0
- TotalIntegrityCheckTime: 0
- TotalReadBlockTime: 0
- TotalTime: 0
EXCHANGE_NODE (id=1)
- BytesReceived: 441
- ConvertRowBatchTime: 0
- DeserializeRowBatchTimer: 0
- FirstBatchArrivalWaitTime: 24000300
- InactiveTotalTime: 24000300
- PeakMemoryUsage: 0
- RowsReturned: 1
- RowsReturnedRate: 41
- SendersBlockedTimer: 0
- SendersBlockedTotalTimer(*): 0
- TotalTime: 24000300
Averaged Fragment F00
split sizes: min: 132.88 KB, max: 132.88 KB, avg: 132.88 KB, stddev: 0
completion times: min:26.000ms max:26.000ms mean: 26.000ms stddev:0.000ns
execution rates: min:4.99 MB/sec max:4.99 MB/sec mean:4.99 MB/sec stddev:0.00 /sec
num instances: 1
- AverageThreadTokens: 0.0
- BloomFilterBytes: 0
- InactiveTotalTime: 0
- PeakMemoryUsage: 1644840
- PerHostPeakMemUsage: 1644840
- PrepareTime: 0
- RowsProduced: 1
- TotalCpuTime: 25999864
- TotalNetworkReceiveTime: 0
- TotalNetworkSendTime: 0
- TotalStorageWaitTime: 22999880
- TotalTime: 25999864
BlockMgr
- BlockWritesOutstanding: 0
- BlocksCreated: 0
- BlocksRecycled: 0
- BufferedPins: 0
- BytesWritten: 0
- InactiveTotalTime: 0
- MaxBlockSize: 8388608
- MemoryLimit: 4294967296
- PeakMemoryUsage: 0
- TotalBufferWaitTime: 0
- TotalEncryptionTime: 0
- TotalIntegrityCheckTime: 0
- TotalReadBlockTime: 0
- TotalTime: 0
DataStreamSender (dst_id=1)
- BytesSent: 441
- InactiveTotalTime: 0
- NetworkThroughput(*): 0
- OverallThroughput: 0
- PeakMemoryUsage: 296
- RowsReturned: 1
- SerializeBatchTime: 0
- TotalTime: 0
- TransmitDataRPCTime: 0
- UncompressedRowBatchSize: 612
HDFS_SCAN_NODE (id=0)
- AverageHdfsReadThreadConcurrency: 0.0
- AverageScannerThreadConcurrency: 0.0
- BytesRead: 199843
- BytesReadDataNodeCache: 0
- BytesReadLocal: 199843
- BytesReadRemoteUnexpected: 0
- BytesReadShortCircuit: 199843
- DecompressionTime: 0
- InactiveTotalTime: 0
- MaxCompressedTextFileLength: 0
- NumColumns: 48
- NumDisksAccessed: 1
- NumRowGroups: 1
- NumScannerThreadsStarted: 1
- PeakMemoryUsage: 1636352
- PerReadThreadRawHdfsThroughput: 0
- RemoteScanRanges: 0
- RowsRead: 760
- RowsReturned: 1
- RowsReturnedRate: 41
- ScanRangesComplete: 1
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsTotalWallClockTime: 22999880
- MaterializeTupleTime(*): 0
- ScannerThreadsSysTime: 2775000
- ScannerThreadsUserTime: 0
- ScannerThreadsVoluntaryContextSwitches: 35
- TotalRawHdfsReadTime(*): 0
- TotalReadThroughput: 0
- TotalTime: 23999876
Fragment F00
Instance 494fccf3229a725c:40217c9c3c50929c (host=server8.example.com:22000)
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/132.88 KB
- AverageThreadTokens: 0.0
- BloomFilterBytes: 0
- InactiveTotalTime: 0
- PeakMemoryUsage: 1644840
- PerHostPeakMemUsage: 1644840
- PrepareTime: 0
- RowsProduced: 1
- TotalCpuTime: 25999864
- TotalNetworkReceiveTime: 0
- TotalNetworkSendTime: 0
- TotalStorageWaitTime: 22999880
- TotalTime: 25999864
BlockMgr
- BlockWritesOutstanding: 0
- BlocksCreated: 0
- BlocksRecycled: 0
- BufferedPins: 0
- BytesWritten: 0
- InactiveTotalTime: 0
- MaxBlockSize: 8388608
- MemoryLimit: 4294967296
- PeakMemoryUsage: 0
- TotalBufferWaitTime: 0
- TotalEncryptionTime: 0
- TotalIntegrityCheckTime: 0
- TotalReadBlockTime: 0
- TotalTime: 0
DataStreamSender (dst_id=1)
- BytesSent: 441
- InactiveTotalTime: 0
- NetworkThroughput(*): 0
- OverallThroughput: 0
- PeakMemoryUsage: 296
- RowsReturned: 1
- SerializeBatchTime: 0
- TotalTime: 0
- TransmitDataRPCTime: 0
- UncompressedRowBatchSize: 612
HDFS_SCAN_NODE (id=0)
ExecOption: Expr Evaluation Codegen Disabled, Codegen enabled: 0 out of 1
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/132.88 KB
Hdfs Read Thread Concurrency Bucket: 0:0% 1:0% 2:0% 3:0%
File Formats: PARQUET/SNAPPY:48
- AverageHdfsReadThreadConcurrency: 0.0
- AverageScannerThreadConcurrency: 0.0
- BytesRead: 199843
- BytesReadDataNodeCache: 0
- BytesReadLocal: 199843
- BytesReadRemoteUnexpected: 0
- BytesReadShortCircuit: 199843
- DecompressionTime: 0
- InactiveTotalTime: 0
- MaxCompressedTextFileLength: 0
- NumColumns: 48
- NumDisksAccessed: 1
- NumRowGroups: 1
- NumScannerThreadsStarted: 1
- PeakMemoryUsage: 1636352
- PerReadThreadRawHdfsThroughput: 0
- RemoteScanRanges: 0
- RowsRead: 760
- RowsReturned: 1
- RowsReturnedRate: 41
- ScanRangesComplete: 1
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsTotalWallClockTime: 22999880
- MaterializeTupleTime(*): 0
- ScannerThreadsSysTime: 2775000
- ScannerThreadsUserTime: 0
- ScannerThreadsVoluntaryContextSwitches: 35
- TotalRawHdfsReadTime(*): 0
- TotalReadThroughput: 0
- TotalTime: 23999876
... View more
03-03-2017
08:58 AM
I ran the same query from impala-shell and DBvisualzer using the latest JDBC 4.1 (2.5.36) and it was 10x slower than the shell. Is there anything I can do to improve the performance of the JDBC connector? Is it anything to do with impala-shell using a Beeswax session type and JDBC using a HiveServer2 session type?
... View more
Labels:
- Labels:
-
Apache Impala