External Parquet table

If I have a process writing Parquet files to a location in HDFS, how can I create an external Impala table that uses these files? How do I reference the schema that is contained within these files?

 

For example, if my Parquet file contains 'State' and 'Population', would I need to create columns in Impala called 'State' and 'Population', or could I use any column names, with the data simply read in the same order? Ex:

CREATE EXTERNAL TABLE parquet_table_name (x STRING, y INT) STORED AS PARQUET LOCATION '/user/testuser/data';

Re: External Parquet table

Yes, you will need to create the table with the same schema as the one stored in the Parquet files.
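
For the example in the question, a minimal sketch might look like the following. The column types are assumptions (the question only gives the column names, so check the actual types in the files), and STORED AS PARQUET is included so that Impala reads the files as Parquet rather than as its default text format.

```sql
-- Column names mirror those in the Parquet files.
-- STRING and INT are assumed types; use BIGINT if Population is stored as a 64-bit integer.
CREATE EXTERNAL TABLE parquet_table_name (
  state STRING,
  population INT
)
STORED AS PARQUET
LOCATION '/user/testuser/data';
```

Because the table is EXTERNAL, dropping it later removes only the table definition and leaves the files in HDFS untouched.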

 

In the upcoming release, we have augmented the CREATE TABLE statement so that it can populate the schema from an existing Parquet file.
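
In Impala releases that include this feature, the clause is LIKE PARQUET. A rough sketch, assuming a data file named example_file.parquet exists in the directory (that file name is hypothetical; the clause needs the path of one actual file, not the directory):

```sql
-- Derive column names and types from one existing Parquet data file.
-- 'example_file.parquet' is a placeholder; substitute a real file from the directory.
CREATE EXTERNAL TABLE parquet_table_name
  LIKE PARQUET '/user/testuser/data/example_file.parquet'
  STORED AS PARQUET
  LOCATION '/user/testuser/data';

-- Check the schema that Impala inferred.
DESCRIBE parquet_table_name;
```

The schema is taken only from the file that is named, so this works best when every file in the directory was written with the same schema.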