Expert Contributor
Posts: 162
Registered: ‎07-29-2013
Accepted Solution

Impala and virtual columns

Hi, imagine I have a table:

CREATE TABLE partitioned_table(....)
PARTITIONED BY (fulldate String)


And a query:


 select distinct(fulldate) from partitioned_table order by fulldate desc limit 100;


What would Impala do? Only the "virtual" (partition?) column is used in the query, so there's no need to fetch any HDFS data.

It looks like right now Impala reads ALL partitions and calculates DISTINCT for the virtual column (virtual = not present in the data files; it is a metadata-only column).
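To illustrate what I mean by answering the query from metadata alone, here is a rough Python sketch. It is not Impala code: the function name `distinct_partition_values` and the sample catalog list are made up for illustration, and it assumes the partition key values are available as plain strings from the table metadata.

```python
from typing import Iterable, List


def distinct_partition_values(partition_values: Iterable[str], limit: int = 100) -> List[str]:
    """Emulate SELECT DISTINCT fulldate ... ORDER BY fulldate DESC LIMIT n
    using only the partition key values known to the catalog (hypothetical helper)."""
    # Every partition carries exactly one value of the partition key, so
    # de-duplicating and sorting the metadata list is enough; no data file is opened.
    return sorted(set(partition_values), reverse=True)[:limit]


# Hypothetical partition key values as they might be listed in table metadata.
catalog = ["2013-07-29", "2013-07-30", "2013-07-30", "2013-08-01"]
print(distinct_partition_values(catalog, limit=2))  # ['2013-08-01', '2013-07-30']
```

One caveat with this shortcut: a partition that exists in metadata but holds no rows would still show up here, whereas a real scan would not return its value.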


Cloudera Employee
Posts: 16
Registered: ‎08-01-2013

Re: Impala and virtual columns

Sounds like something that could be done, so I've added a JIRA to track it.