I am trying to validate schema evolution across different formats (ORC, Parquet, and Avro). My source data is CSV, and its structure changes when new releases of the applications are deployed (for example, columns are added or removed). If I load this data into a Hive table as a snapshot each day, how can I track these schema changes and read the data from these Hive snapshots using the right schema definition? Is there any project that has implemented reading Hive tables using dynamic schema definitions?
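To make the "track these schema changes" part concrete, here is a minimal sketch in Python of what I mean: compare the CSV header of one daily snapshot with the next and report which columns were added or removed. The function names (`read_header`, `diff_schema`) and the sample data are hypothetical and only for illustration; the sketch assumes each CSV file carries its column names in the first row, and it does not touch Hive, ORC, Parquet, or Avro at all.

```python
import csv
import io


def read_header(csv_text):
    """Return the column names from the first row of a CSV document."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def diff_schema(old_columns, new_columns):
    """Report columns added and removed between two snapshots, keeping order."""
    old_set, new_set = set(old_columns), set(new_columns)
    added = [c for c in new_columns if c not in old_set]
    removed = [c for c in old_columns if c not in new_set]
    return {"added": added, "removed": removed}


# Hypothetical snapshots from two consecutive days (made-up data).
day1 = "id,name,email\n1,Ann,ann@example.com\n"
day2 = "id,name,phone,country\n1,Ann,555-0100,US\n"

changes = diff_schema(read_header(day1), read_header(day2))
print(changes)  # {'added': ['phone', 'country'], 'removed': ['email']}
```

A diff like this could be stored alongside each snapshot date, so that a reader knows which column list applies to which day. What I am unsure about is how to carry that through to the Hive table definition for each format.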