For example:
I have a table in my source database with the columns monthid (int) and
monthshort (string). I copy the table's data daily using NiFi and store it in HDFS.
I have created an external table in Hive:
CREATE external table if not exists test.s3amonths(
monthid int,
monthshort string )
PARTITIONED BY (load_date string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION 'hdfs://xxx/bshah-s3hdptest/test/months';
I load the data daily using statements like the following:
alter table test.s3amonths add IF NOT EXISTS partition (load_date='2016-12-10') location 'hdfs://xxx/bshah-s3hdptest/test/months/load_date=2016-12-10';
alter table test.s3amonths add IF NOT EXISTS partition (load_date='2016-12-11') location 'hdfs://xxx/bshah-s3hdptest/test/months/load_date=2016-12-11';
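After each load I check that the partition is registered and readable with something like the queries below (the particular load_date and the LIMIT are just for illustration):
SHOW PARTITIONS test.s3amonths;
SELECT * FROM test.s3amonths WHERE load_date='2016-12-10' LIMIT 10;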
Now the schema of the source table has changed to: monthid (int), monthlong (string), monthshort (string).
When loading the new partition, how do I ensure that the existing data will not be affected and that the new data, which carries the additional column, will also load successfully?
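The only idea I have so far is to add the new column to the Hive table definition before registering the next partition, roughly as sketched below. The CASCADE keyword and the load_date='2016-12-12' value are only my guesses for illustration, and I am not sure this is safe here because the source adds monthlong between the existing columns, while Hive only appends new columns at the end:
-- guess: extend the table schema (CASCADE is meant to push the change to existing partition metadata as well)
ALTER TABLE test.s3amonths ADD COLUMNS (monthlong string) CASCADE;
-- then register the new day's data as before
ALTER TABLE test.s3amonths ADD IF NOT EXISTS PARTITION (load_date='2016-12-12') LOCATION 'hdfs://xxx/bshah-s3hdptest/test/months/load_date=2016-12-12';
Would the old ORC files in the existing partitions still be read correctly (with the missing column returned as NULL), given that the column order in the new files does not match the table definition?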