10-18-2013 03:42 AM
Hi, imagine I have a table:
CREATE TABLE partitioned_table(....)
PARTITIONED BY (fulldate String)
And a query:
select distinct(fulldate) from partitioned_table order by fulldate desc limit 100;
What would Impala do? Only the "virtual" (partition) column appears in the query, so there should be no need to read any HDFS data.
It looks like right now Impala reads ALL partitions and calculates DISTINCT for the virtual column (virtual = not present in the data files; it is a metadata-only column).
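To illustrate what I mean by "metadata-only", here is a minimal Python sketch. It assumes the usual Hive-style layout where each partition is a directory named fulldate=<value>. The function name and the example paths are made up for illustration; this is not how Impala works internally, just the result I would expect to be computable from partition names alone, without opening a single data file.

```python
def distinct_partition_values(partition_dirs, column, limit=100):
    """Distinct values of a partition column, newest first, from directory names only.

    Hypothetical helper: mimics
    SELECT DISTINCT(column) ... ORDER BY column DESC LIMIT limit
    using nothing but partition directory names.
    """
    prefix = column + "="
    values = set()
    for path in partition_dirs:
        # A partition directory contains a component like "fulldate=2013-10-18".
        for component in path.strip("/").split("/"):
            if component.startswith(prefix):
                values.add(component[len(prefix):])
    # The column is a String, so the ordering is lexicographic.
    return sorted(values, reverse=True)[:limit]


# Assumed example paths (not taken from a real cluster).
dirs = [
    "/warehouse/partitioned_table/fulldate=2013-10-16",
    "/warehouse/partitioned_table/fulldate=2013-10-18",
    "/warehouse/partitioned_table/fulldate=2013-10-17",
    "/warehouse/partitioned_table/fulldate=2013-10-18",
]
print(distinct_partition_values(dirs, "fulldate"))
```

The only input is the list of partition directories, which the metastore already knows about, so no data files have to be scanned.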