Predicate Pushdown vs Bloom Filter

Hi all,

While looking into query optimizations for big data, especially ORC files, I came across two options: predicate pushdown and Bloom filters.

Predicate pushdown lets the reader skip unnecessary stripes, which reduces I/O. To me, Bloom filters appear to serve the same purpose, except for the difference below.

Predicate pushdown does not require us to explicitly create any artifacts while writing an ORC file, whereas Bloom filters require us to configure, at write time, the columns the filter should be built on.
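To make the comparison concrete, here is a toy Python sketch of the two skipping mechanisms as I understand them. It is not ORC's actual implementation: `ToyBloomFilter`, `build_chunk`, and `chunks_to_read` are hypothetical names, each "chunk" stands in for a stripe or row group, and the min/max values play the role of the statistics the writer records automatically. The Bloom filter is built only when asked for at write time, which mirrors the difference described above.

```python
import hashlib


class ToyBloomFilter:
    """Illustrative Bloom filter; sizes and hashing are arbitrary assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, value):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_chunk(values, with_bloom):
    """Simulate writing one chunk: min/max are always kept, Bloom filter is opt-in."""
    chunk = {"min": min(values), "max": max(values), "bloom": None}
    if with_bloom:
        bloom = ToyBloomFilter()
        for v in values:
            bloom.add(v)
        chunk["bloom"] = bloom
    return chunk


def chunks_to_read(chunks, target):
    """Return indexes of chunks that cannot be skipped for `column = target`."""
    selected = []
    for i, chunk in enumerate(chunks):
        if not (chunk["min"] <= target <= chunk["max"]):
            continue  # skipped by min/max statistics
        if chunk["bloom"] is not None and not chunk["bloom"].might_contain(target):
            continue  # skipped by the Bloom filter
        selected.append(i)
    return selected


# Unsorted data: every chunk spans almost the same range of values.
rows = [[1, 500, 999], [2, 42, 998], [3, 700, 997]]
plain = [build_chunk(v, with_bloom=False) for v in rows]
bloomed = [build_chunk(v, with_bloom=True) for v in rows]

print(chunks_to_read(plain, 42))    # min/max alone cannot rule out any chunk
print(chunks_to_read(bloomed, 42))  # always includes chunk 1; others are usually skipped
```

In this sketch, min/max statistics help only when the value falls outside a chunk's range, so they work best on sorted or clustered columns. The Bloom filter helps with equality lookups on columns whose ranges overlap, at the cost of extra space and write-time configuration, and it can return false positives but never false negatives.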

I would appreciate any suggestions to help me understand this better.

Thanks,
Santosh