Hive Explain Plan Predicate Question

New Contributor

Could you remind me what's going on in this example explain plan? The table contains 611 rows, which I can see being read. Then the predicate key is not null appears to be applied, and the number of rows drops to 306. There are no null fields in this dataset.

How is this pruning data? I would have expected the row count after the filter to be the same as the input size.

Map Operator Tree:
    TableScan
      alias: a
      filterExpr: key is not null (type: boolean)
      Statistics: Num rows: 611 Data size: 1833 Basic stats: COMPLETE Column stats: NONE
      Filter Operator
        predicate: key is not null (type: boolean)
        Statistics: Num rows: 306 Data size: 918 Basic stats: COMPLETE Column stats: NONE
        Reduce Output Operator
          key expressions: key (type: string)
          sort order: +
          Map-reduce partition columns: key (type: string)
          Statistics: Num rows: 306 Data size: 918 Basic stats: COMPLETE Column stats: NONE

1 ACCEPTED SOLUTION

Cloudera Employee

These numbers (Num rows, Data size) are estimates produced by the Hive optimizer and do not represent actual row counts or sizes. You can run EXPLAIN ANALYZE on the query to see both the estimated and the actual numbers.
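As a rough illustration of that suggestion, the sketch below shows how the estimated and actual row counts could be compared. The table name my_table and the query itself are hypothetical stand-ins, since the original query is not shown in the thread. The drop from 611 to 306 looks consistent with the optimizer assuming that about half the rows pass a filter when it has no column statistics (Column stats: NONE), but that reading is an inference from the plan rather than something stated in the answer. Note that EXPLAIN ANALYZE executes the query in order to collect actual counts, and it is only available in newer Hive releases.

```sql
-- Hypothetical table and query; substitute your own.
-- EXPLAIN ANALYZE runs the query and annotates the plan with
-- actual row counts next to the optimizer's estimates.
EXPLAIN ANALYZE
SELECT a.key
FROM my_table a
WHERE a.key IS NOT NULL;

-- Optionally gather column-level statistics so the optimizer has
-- real null counts to work with instead of a default guess.
ANALYZE TABLE my_table COMPUTE STATISTICS FOR COLUMNS;

-- Re-check the plan; the Statistics lines should no longer report
-- "Column stats: NONE" once column statistics have been collected.
EXPLAIN
SELECT a.key
FROM my_table a
WHERE a.key IS NOT NULL;
```

If the estimate after the filter still differs from the input row count once column statistics are in place, comparing it against the actual count from EXPLAIN ANALYZE should show whether the difference is only an estimation artifact.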

New Contributor

Thanks! This is exactly what I was looking for.
