Hash join node non-child time is much bigger than the sum of all breakdown costs


Explorer

Hi

Can anyone help explain what is going on in this hash join node? The non-child time is 5s326ms. However, the sum of all the breakdown costs (build time, probe time, build partition time, etc.) does not equal the execution time of this node, i.e., the non-child time. What part of the execution time is missing from this profile?

 

The full profile is available at the link below. We are using CDH4.5.8 with Impala 2.2.

 

HASH_JOIN_NODE (id=2):(Total: 8s009ms, non-child: 5s326ms, % non-child: 66.51%)
ExecOption: Build Side Codegen Enabled, Probe Side Codegen Enabled, Join Build-Side Prepared Asynchronously
- BuildPartitionTime: 246.892ms
- BuildRows: 1.50M (1504350)
- BuildRowsPartitioned: 1.50M (1504350)
- BuildTime: 205.22ms
- GetNewBlockTime: 2.107ms
- HashBuckets: 4.19M (4194304)
- LargestPartitionPercent: 6 (6)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 451.02 MB (472932352)
- PinTime: 0ns
- ProbeRows: 2.50M (2502844)
- ProbeRowsPartitioned: 0 (0)
- ProbeTime: 860.379ms
- RowsReturned: 417.52K (417520)
- RowsReturnedRate: 52.13 K/sec
- SpilledPartitions: 0 (0)
- UnpinTime: 997ns

 

https://my.syncplicity.com/share/knuknsvjyz1kzyu/profile

password: 123456
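To make the size of the gap concrete, here is a minimal Python sketch that adds up the timer counters from the excerpt above and compares the total with the reported non-child time. The helper names (`duration_to_ms`, `timer_counters`, `non_child_ms`) are made up for this illustration, the duration format is inferred only from the values shown in the excerpt, and the timers are assumed not to overlap, which may not hold for every counter.

```python
import re

# Milliseconds per unit, for the pretty-printed durations seen in the excerpt
# (e.g. "5s326ms", "246.892ms", "997ns"). Assumed format, not an official spec.
UNIT_TO_MS = {"h": 3_600_000.0, "m": 60_000.0, "s": 1_000.0,
              "ms": 1.0, "us": 1e-3, "ns": 1e-6}
DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|h|m|s)")

PROFILE = """\
HASH_JOIN_NODE (id=2):(Total: 8s009ms, non-child: 5s326ms, % non-child: 66.51%)
- BuildPartitionTime: 246.892ms
- BuildRows: 1.50M (1504350)
- BuildTime: 205.22ms
- GetNewBlockTime: 2.107ms
- PinTime: 0ns
- ProbeRows: 2.50M (2502844)
- ProbeTime: 860.379ms
- RowsReturnedRate: 52.13 K/sec
- UnpinTime: 997ns
"""


def duration_to_ms(text):
    """Convert a duration such as '5s326ms' to milliseconds."""
    return sum(float(value) * UNIT_TO_MS[unit]
               for value, unit in DURATION_PART.findall(text))


def timer_counters(profile):
    """Collect every '- <Name>Time: <duration>' counter, in milliseconds."""
    timers = {}
    for line in profile.splitlines():
        match = re.match(r"\s*-\s*(\w+Time):\s*(\S+)", line)
        if match:
            timers[match.group(1)] = duration_to_ms(match.group(2))
    return timers


def non_child_ms(profile):
    """Read the 'non-child: <duration>' value from the node header line."""
    match = re.search(r"non-child:\s*([^,]+),", profile)
    return duration_to_ms(match.group(1))


timers = timer_counters(PROFILE)
accounted_ms = sum(timers.values())   # assumes the timers do not overlap
gap_ms = non_child_ms(PROFILE) - accounted_ms

for name, value in timers.items():
    print(f"{name:>20}: {value:10.3f} ms")
print(f"{'sum of timers':>20}: {accounted_ms:10.3f} ms")
print(f"{'non-child':>20}: {non_child_ms(PROFILE):10.3f} ms")
print(f"{'unaccounted':>20}: {gap_ms:10.3f} ms")
```

With the numbers from the excerpt, the timers add up to roughly 1.31 s against a non-child time of 5.33 s, so about 4 s is not explained by any listed counter.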

1 ACCEPTED SOLUTION


Re: Hash join node non-child time is much bigger than the sum of all breakdown costs

Contributor

This is apparently a bug that has been fixed in more recent versions:

 

https://issues.cloudera.org/browse/IMPALA-2075

Re: Hash join node non-child time is much bigger than the sum of all breakdown costs

Explorer

Thanks. Does this affect the final execution time listed in the Timeline section of the profile? Do I have to subtract the additional time recorded in the hash join node to get the correct query execution time?

 

Re: Hash join node non-child time is much bigger than the sum of all breakdown costs

Contributor
No, it does not affect the timeline.