Hash join node non-child time is much bigger than the sum of all breakdown costs

Explorer

Hi

Can anyone help explain what is going on in this hash join node? The non-child time is 5s326ms. However, the sum of all the breakdown costs (build time, probe time, build partition time, and so on) does not equal the execution time of this node, i.e., the non-child time. Which part of the execution time is missing from this profile?

The full profile is attached at the link below. We are using CDH4.5.8 Impala 2.2.

HASH_JOIN_NODE (id=2):(Total: 8s009ms, non-child: 5s326ms, % non-child: 66.51%)
ExecOption: Build Side Codegen Enabled, Probe Side Codegen Enabled, Join Build-Side Prepared Asynchronously
- BuildPartitionTime: 246.892ms
- BuildRows: 1.50M (1504350)
- BuildRowsPartitioned: 1.50M (1504350)
- BuildTime: 205.22ms
- GetNewBlockTime: 2.107ms
- HashBuckets: 4.19M (4194304)
- LargestPartitionPercent: 6 (6)
- MaxPartitionLevel: 0 (0)
- NumRepartitions: 0 (0)
- PartitionsCreated: 16 (16)
- PeakMemoryUsage: 451.02 MB (472932352)
- PinTime: 0ns
- ProbeRows: 2.50M (2502844)
- ProbeRowsPartitioned: 0 (0)
- ProbeTime: 860.379ms
- RowsReturned: 417.52K (417520)
- RowsReturnedRate: 52.13 K/sec
- SpilledPartitions: 0 (0)
- UnpinTime: 997ns

 

https://my.syncplicity.com/share/knuknsvjyz1kzyu/profile

password: 123456
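
The size of the gap described above can be worked out directly from the counters in the posted node. The following is a minimal Python sketch, not part of Impala: the `to_ms` helper is hypothetical, and the calculation assumes the listed timers do not overlap one another.

```python
import re

# Milliseconds per unit, for the duration formats that appear in the profile.
UNIT_MS = {"h": 3_600_000.0, "m": 60_000.0, "s": 1_000.0,
           "ms": 1.0, "us": 1e-3, "ns": 1e-6}


def to_ms(text):
    """Hypothetical helper: convert '5s326ms', '246.892ms' or '997ns' to ms."""
    # 'ms' is listed before 'm' and 's' so that it is matched as one unit.
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)", text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(value) * UNIT_MS[unit] for value, unit in parts)


# Timer counters copied from HASH_JOIN_NODE (id=2) in the profile above.
timers = {
    "BuildPartitionTime": "246.892ms",
    "BuildTime": "205.22ms",
    "GetNewBlockTime": "2.107ms",
    "PinTime": "0ns",
    "ProbeTime": "860.379ms",
    "UnpinTime": "997ns",
}

non_child_ms = to_ms("5s326ms")
# Assumption: the timers are disjoint, so their sum is the time they explain.
accounted_ms = sum(to_ms(value) for value in timers.values())
gap_ms = non_child_ms - accounted_ms

print(f"sum of timers: {accounted_ms:.3f} ms")
print(f"non-child:     {non_child_ms:.3f} ms")
print(f"unaccounted:   {gap_ms:.3f} ms ({gap_ms / non_child_ms:.1%})")
```

With these counters the timers add up to roughly 1.3 s, which leaves about 4 s of the 5s326ms non-child time unaccounted for.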

1 ACCEPTED SOLUTION

Re: Hash join node non-child time is much bigger than the sum of all breakdown costs

Contributor

This is apparently a bug that has been fixed in more recent versions:

 

https://issues.cloudera.org/browse/IMPALA-2075

Re: Hash join node non-child time is much bigger than the sum of all breakdown costs

Explorer

Thanks. Does this affect the final execution time listed in the Timeline section of the profile? Do I have to subtract the extra time recorded in the hash join node to get the correct query execution time?

 

Re: Hash join node non-child time is much bigger than the sum of all breakdown costs

Contributor
No, it does not affect the timeline.