How to create Impala derived tables

Explorer

I read in a presentation that derived tables would be available in versions after 2.4. I am using 2.5 with the quick start, but it seems that the CREATE DERIVED TABLE keyword is not supported?


5 REPLIES


Unfortunately, that feature has not made it in. The documentation at http://www.cloudera.com/documentation.html is the source of truth about which features are or are not present.

Explorer

Thank you Tim for the quick reply.

When can we expect the feature to be available? It is my best selling point for the Cloudera option with my team.


I don't think it's on the immediate roadmap. Our focus recently has been on various other things (performance, Amazon EC2 support, etc.).

Explorer

Hi Tim,

That is unfortunate.

However, I am creating external tables with configured locations, and that works great: to push new daily data, all I need to do is create the Parquet file and alter the table with the new partition details.

The challenge now is the aggregated tables that are created as CREATE TABLE FOO AS SELECT..FROM ABC, XYZ,..WHERE...;

I have to recreate these each time. There is no way to just alter them and make them aware of the new data, because the queries are evaluated at creation time!

Is there any other way to create aggregated tables that are updated automatically?

Cheers,
Khalef


The only way to do this with zero work would be to use a view: http://www.cloudera.com/documentation/enterprise/latest/topics/impala_create_view.html

Otherwise, you do have to run the queries as part of your data pipeline, as you mentioned.
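
To make the difference concrete, here is a minimal sketch of the two options. It is not Impala-specific: it uses Python's built-in sqlite3 module as a stand-in so it can be run anywhere, and the table, column, and function names (sales, sales_by_day_snapshot, sales_by_day_view, refresh_snapshot) are made up for illustration. The CREATE TABLE ... AS SELECT and CREATE VIEW statements have the same general shape in Impala, but check the Impala documentation for the exact syntax and options.

```python
import sqlite3

# SQLite is only a stand-in for Impala; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2016-05-01", 10), ("2016-05-01", 5)],
)

AGG_QUERY = "SELECT day, SUM(amount) AS total FROM sales GROUP BY day"

# CREATE TABLE ... AS SELECT: the query runs once and its result is stored.
conn.execute(f"CREATE TABLE sales_by_day_snapshot AS {AGG_QUERY}")

# CREATE VIEW: only the query text is stored; it runs whenever the view is read.
conn.execute(f"CREATE VIEW sales_by_day_view AS {AGG_QUERY}")


def refresh_snapshot(connection):
    """Pipeline step: rebuild the stored aggregate after new data has landed."""
    connection.execute("DROP TABLE IF EXISTS sales_by_day_snapshot")
    connection.execute(f"CREATE TABLE sales_by_day_snapshot AS {AGG_QUERY}")


def read_all(connection, name):
    """Return every row of a table or view, ordered by day."""
    return connection.execute(f"SELECT * FROM {name} ORDER BY day").fetchall()


# New daily data arrives in the base table.
conn.execute("INSERT INTO sales VALUES ('2016-05-02', 7)")

snapshot_before = read_all(conn, "sales_by_day_snapshot")  # stale: no new day
view_rows = read_all(conn, "sales_by_day_view")            # includes the new day

refresh_snapshot(conn)
snapshot_after = read_all(conn, "sales_by_day_snapshot")   # now matches the view

print("snapshot before refresh:", snapshot_before)
print("view:", view_rows)
print("snapshot after refresh:", snapshot_after)
```

The trade-off is that the view recomputes the aggregation on every read, while the stored table is cheap to read but is only as fresh as the last pipeline run that rebuilt it.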