11-01-2016 09:31 PM
Quick question on performance. If I have two tables, the first with columns "a, b" and the second with columns "c, d", and I create a view like the following:
CREATE VIEW my_view AS ( select a,b,null,null from table_1 union select null,null,c,d from table_2)
Now if I do a simple query like :
select a from my_view
Will the query read only from table_1, or will all of table_2 be scanned as well?
(I am mainly worried about disk reads)
11-02-2016 04:08 AM
You should be able to tell from the query profile. Run the query, and immediately afterwards run "profile;" in the Impala shell to display the profile information, which also includes details about the table scans. Feel free to post the profile here if you need help inspecting it.
11-02-2016 05:41 PM
Both tables have to be scanned to preserve SQL semantics. Otherwise, we would be changing the number of rows coming out of your view: "select a from my_view" also has to return the NULL values of column a produced by the table_2 operand. If you want to drop the second union operand, you could add "WHERE a IS NOT NULL" to the query, and then the second table will not be scanned.
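To make the semantics point concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Impala. This is an assumption made purely for illustration: SQLite tells you nothing about Impala's scan behavior or disk reads, it only shows that the unfiltered query returns extra NULL rows that originate from table_2. The sample rows and the "AS c" / "AS d" aliases in the view are also made up for the example.

```python
import sqlite3

# SQLite is only a stand-in here: it illustrates result semantics,
# not how Impala plans or executes the scans.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (a INTEGER, b INTEGER);
    CREATE TABLE table_2 (c INTEGER, d INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO table_1 VALUES (1, 10), (2, 20);
    INSERT INTO table_2 VALUES (3, 30), (4, 40);
    -- Same shape as the view in the question; the NULL columns are
    -- aliased so that every view column has a distinct name.
    CREATE VIEW my_view AS
        SELECT a, b, NULL AS c, NULL AS d FROM table_1
        UNION
        SELECT NULL, NULL, c, d FROM table_2;
""")

# Without a filter, rows from table_2 surface as NULL (None) in column a.
unfiltered = [row[0] for row in conn.execute("SELECT a FROM my_view")]

# With the filter, only values that came from table_1 can remain.
filtered = [
    row[0]
    for row in conn.execute("SELECT a FROM my_view WHERE a IS NOT NULL")
]

print("unfiltered:", unfiltered)
print("filtered:  ", filtered)
```

The unfiltered result contains None entries contributed by table_2, which is why the planner cannot skip that table without changing the answer. To confirm what Impala actually scans for either query, the query profile mentioned above is the place to look.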