For example:
I have a table in my source database with the columns monthid (int) and
monthshort (string). I copy the table's data daily using NiFi and store it in HDFS.
I have created an external table in Hive:
CREATE external table if not exists test.s3amonths(
monthid int,
monthshort string )
PARTITIONED BY (load_date string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION 'hdfs://xxx/bshah-s3hdptest/test/months';
I load the data daily using statements like the following:
alter table test.s3amonths add IF NOT EXISTS partition (load_date='2016-12-10') location 'hdfs://xxx/bshah-s3hdptest/test/months/load_date=2016-12-10';
alter table test.s3amonths add IF NOT EXISTS partition (load_date='2016-12-11') location 'hdfs://xxx/bshah-s3hdptest/test/months/load_date=2016-12-11';
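After each load I check that the partition is registered and readable with something like the queries below (the particular load_date and the LIMIT are just for illustration):
SHOW PARTITIONS test.s3amonths;
SELECT * FROM test.s3amonths WHERE load_date='2016-12-10' LIMIT 10;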
Now the schema of the source table has changed to: monthid (int), monthlong (string), monthshort (string).
When loading the new partition, how do I ensure that the existing data will not be affected and that the new data, which carries the additional column, will also load successfully?
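The only idea I have so far is to add the new column to the Hive table definition before registering the next partition, roughly as sketched below. The CASCADE keyword and the load_date='2016-12-12' value are only my guesses for illustration, and I am not sure this is safe here because the source adds monthlong between the existing columns, while Hive only appends new columns at the end:
-- guess: extend the table schema (CASCADE is meant to push the change to existing partition metadata as well)
ALTER TABLE test.s3amonths ADD COLUMNS (monthlong string) CASCADE;
-- then register the new day's data as before
ALTER TABLE test.s3amonths ADD IF NOT EXISTS PARTITION (load_date='2016-12-12') LOCATION 'hdfs://xxx/bshah-s3hdptest/test/months/load_date=2016-12-12';
Would the old ORC files in the existing partitions still be read correctly (with the missing column returned as NULL), given that the column order in the new files does not match the table definition?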