Explorer
Posts: 11
Registered: ‎01-26-2016
Accepted Solution

Hash join node non-child time is much bigger than the sum of all breakdown costs

Hi

Can anyone help explain what's going on in this hash join node? The non-child time is 5s326ms. However, if I sum up all the breakdown costs (build time, probe time, build partition time, etc.), the total is far less than the execution time of this node, i.e., the non-child time. What part of the execution time is missing from this profile?

 

The full profile is available at the link below. We are using CDH4.5.8 with Impala 2.2.

 

HASH_JOIN_NODE (id=2):(Total: 8s009ms, non-child: 5s326ms, % non-child: 66.51%)
ExecOption: Build Side Codegen Enabled, Probe Side Codegen Enabled, Join Build-Side Prepared Asynchronously
- BuildPartitionTime: 246.892ms
- BuildRows: 1.50M (1504350)
- BuildRowsPartitioned: 1.50M (1504350)
- BuildTime: 205.22ms
- GetNewBlockTime: 2.107ms
- HashBuckets: 4.19M (4194304)
- LargestPartitionPercent: 6 (6)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 451.02 MB (472932352)
- PinTime: 0ns
- ProbeRows: 2.50M (2502844)
- ProbeRowsPartitioned: 0 (0)
- ProbeTime: 860.379ms
- RowsReturned: 417.52K (417520)
- RowsReturnedRate: 52.13 K/sec
- SpilledPartitions: 0 (0)
- UnpinTime: 997ns
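
To make the gap concrete, the timer counters above can be added up with a short script. This is only a rough sketch: the `to_ms` helper and the choice of which counters to sum are my own assumptions, not anything Impala provides. The timers may also overlap (for example, block allocation could happen inside partitioning), so the sum is at best an upper bound on what the breakdown accounts for.

```python
import re

# Timer counters copied from the HASH_JOIN_NODE profile above.
PROFILE = """\
- BuildPartitionTime: 246.892ms
- BuildTime: 205.22ms
- GetNewBlockTime: 2.107ms
- PinTime: 0ns
- ProbeTime: 860.379ms
- UnpinTime: 997ns
"""
NON_CHILD = "5s326ms"  # non-child time reported on the node's header line

# Conversion factors from profile units to milliseconds.
UNIT_TO_MS = {"ns": 1e-6, "us": 1e-3, "ms": 1.0, "s": 1000.0}


def to_ms(text):
    """Convert a duration such as '5s326ms' or '997ns' to milliseconds.

    Hypothetical helper written for this post; it only handles the
    unit suffixes that appear in the profile excerpt.
    """
    parts = re.findall(r"([\d.]+)(ns|us|ms|s)", text)
    return sum(float(value) * UNIT_TO_MS[unit] for value, unit in parts)


# Collect every "- Name: duration" line into a dict of milliseconds.
timers = {}
for line in PROFILE.splitlines():
    name, value = line.lstrip("- ").split(": ")
    timers[name] = to_ms(value)

accounted_ms = sum(timers.values())
non_child_ms = to_ms(NON_CHILD)
missing_ms = non_child_ms - accounted_ms

print(f"sum of timers : {accounted_ms:.3f} ms")
print(f"non-child time: {non_child_ms:.3f} ms")
print(f"unaccounted   : {missing_ms:.3f} ms")
```

With the numbers from this profile, the timers add up to roughly 1.3 s, so about 4 s of the 5s326ms non-child time is not covered by any listed counter.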

 

https://my.syncplicity.com/share/knuknsvjyz1kzyu/profile

password: 123456

Cloudera Employee
Posts: 35
Registered: ‎10-20-2015

Re: Hash join node non-child time is much bigger than the sum of all breakdown costs

This is apparently a bug that has been fixed in more recent versions:

 

https://issues.cloudera.org/browse/IMPALA-2075

Explorer
Posts: 11
Registered: ‎01-26-2016

Re: Hash join node non-child time is much bigger than the sum of all breakdown costs

Thanks. Does this affect the final execution time listed in the Timeline part of the profile? Do I have to subtract the additional time recorded in the hash join node to get the correct query execution time?

 

Cloudera Employee
Posts: 35
Registered: ‎10-20-2015

Re: Hash join node non-child time is much bigger than the sum of all breakdown costs

No, it does not affect the timeline.