
Avro: Schema evolution

A Hive table that uses the AvroSerDe is associated with a static schema file (.avsc). When the underlying data files have varying schemas, queries against the table fail while parsing the data.

Option 1:

------------

Whenever the schema changes, the current and the new schema can be compared and the schema file edited by hand, for example by adding a default value for a newly added column. However, this is a manual process. Is it feasible to automate this comparison with custom code and evolve the schema file?
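
To make the question concrete, here is a rough sketch of what such an automatic comparison might look like. It is only an illustration under several assumptions: both schemas are top-level Avro records already loaded as JSON, only added fields are handled (not removed or retyped ones), and a new field without a default is made nullable with a null default. The function names `evolve_schema` and `nullable` are hypothetical, not part of Hive or Avro.

```python
import json


def nullable(avro_type):
    """Return a union with "null" as the first branch.

    "null" is placed first so that a default of null is valid for the union.
    """
    if isinstance(avro_type, list):
        # Already a union: keep its branches, but put "null" in front.
        return ["null"] + [t for t in avro_type if t != "null"]
    return ["null", avro_type]


def evolve_schema(current, new):
    """Merge fields that exist only in `new` into a copy of `current`.

    Assumption: both arguments are dicts describing top-level Avro records.
    Fields added by `new` that carry no default become nullable with a
    default of null, so files written with the old schema remain readable.
    """
    merged = json.loads(json.dumps(current))  # cheap deep copy of JSON data
    known = {field["name"] for field in merged["fields"]}
    for field in new["fields"]:
        if field["name"] in known:
            continue  # field already present; type changes are not handled
        added = dict(field)
        if "default" not in added:
            added["type"] = nullable(added["type"])
            added["default"] = None
        merged["fields"].append(added)
    return merged
```

The merged result could then be written back with `json.dump` as the new .avsc file. Fields that were dropped in the new schema would also need defaults in the merged schema, which this sketch does not check.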

Option 2:

------------

Since each Avro file contains both the schema and the data, can we write a custom SerDe that extends the Hive class, reads and parses the data from HDFS, and relays the results back to Hive? Is this approach feasible?
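
For reference, the embedded schema mentioned above sits in the header of the Avro object container file, so it can be pulled out without decoding any records. The sketch below is not a Hive SerDe; it only shows, under the assumption that the input follows the container layout in the Avro specification (magic bytes, a metadata map, then a sync marker), how the writer schema might be recovered. The function names are hypothetical, and only the standard library is used.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one Avro long (zigzag-encoded, variable-length)."""
    shift = 0
    value = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of Avro header")
        byte = chunk[0]
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (value >> 1) ^ -(value & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_writer_schema(stream):
    """Return the schema embedded in the file header as a dict."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:
            break  # a zero count ends the metadata map
        if count < 0:
            read_long(stream)  # negative count: a byte size follows, skip it
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return json.loads(meta["avro.schema"])


def schema_from_bytes(data):
    """Convenience wrapper for header bytes already held in memory."""
    return read_writer_schema(io.BytesIO(data))
```

With a local file this would be called as `read_writer_schema(open(path, "rb"))`; reading from HDFS would need an HDFS client to supply the stream, which is outside this sketch.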