If I have a process writing Parquet files to a location in HDFS, how can I create an external Impala table that uses these files? How do I reference the schema that is contained within the files?
For example, if my Parquet file contains 'State' and 'Population', would I need to create columns in Impala called 'State' and 'Population', or could I use any column names, with the data simply being read in the same order? Ex:
create external table parquet_table_name (x STRING, y INT) LOCATION '/user/testuser/data';