11-28-2017 03:57 PM - last edited on 11-29-2017 05:54 AM by cjervis
Is it possible to make Impala default to 'name' instead of 'position' so I don't have to do this every Hue session?
My Parquet files don't always have the same set of columns, so I have to look up columns by name.
11-28-2017 04:14 PM - edited 11-28-2017 04:21 PM
You can change the default query options via impalad command line options:
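As a rough sketch, assuming the option being discussed is PARQUET_FALLBACK_SCHEMA_RESOLUTION (the thread does not name it explicitly), the impalad startup flag would look something like the fragment below. The exact quoting and the place where you add it depend on how your impalad processes are launched, so treat this as an illustration rather than a copy-paste recipe.

```
# Assumed impalad startup flag; sets the session default for every new connection
--default_query_options='parquet_fallback_schema_resolution=name'
```

The per-session equivalent, which this default is meant to replace, would be running `SET PARQUET_FALLBACK_SCHEMA_RESOLUTION=name;` at the start of each session.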
The same can be done in Cloudera Manager: https://www.cloudera.com/documentation/enterprise/5-12-x/topics/impala_config_options.html
11-28-2017 04:32 PM
11-28-2017 05:00 PM
It's generally safe to use name-based resolution by default. Performance should be about the same. I agree name-based resolution may be a better choice because it's more intuitive.
Index-based and name-based resolution have different tradeoffs in terms of which schema-evolution operations are allowed. For example, with index-based resolution you can safely rename a column in your table schema. With name-based resolution you can safely add or drop columns in the middle of your table schema, whereas with index-based resolution you can generally only add new columns at the end. So it's really all about tradeoffs.
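To make the tradeoff concrete, here is a small toy model in Python. It is not Impala's implementation; `resolve`, the column names, and the sample values are all made up for illustration. It only mimics the idea of matching table columns to file columns either by ordinal position or by name.

```python
def resolve(table_schema, file_columns, mode):
    """Toy model: map each table column to a value from one file row.

    table_schema: list of column names in the current table definition.
    file_columns: list of (name, value) pairs as written in the file.
    mode: "position" or "name".
    Columns that cannot be matched come back as None (think NULL).
    """
    if mode == "position":
        values = [value for _, value in file_columns]
        return {
            col: (values[i] if i < len(values) else None)
            for i, col in enumerate(table_schema)
        }
    if mode == "name":
        by_name = dict(file_columns)
        return {col: by_name.get(col) for col in table_schema}
    raise ValueError("mode must be 'position' or 'name'")


# A file written when the table had columns (id, city).
old_file = [("id", 1), ("city", "Oslo")]

# Case 1: the column 'city' was later renamed to 'town' in the table schema.
renamed = ["id", "town"]
rename_by_position = resolve(renamed, old_file, "position")  # still finds the data
rename_by_name = resolve(renamed, old_file, "name")          # 'town' has no match

# Case 2: a new column 'country' was added in the middle of the table schema.
added_middle = ["id", "country", "city"]
add_by_position = resolve(added_middle, old_file, "position")  # columns shift
add_by_name = resolve(added_middle, old_file, "name")          # matches correctly

print(rename_by_position, rename_by_name)
print(add_by_position, add_by_name)
```

In this sketch, the rename works only with position-based matching, while the mid-schema addition works only with name-based matching, which mirrors the tradeoff described above.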