This is from the Hive manual:
Hive stores a list of partitions for each table in its metastore. If, however, new partitions are directly added to HDFS (say by using
hadoop fs -put command) or removed from HDFS, the metastore (and hence Hive) will not be aware of these changes to partition information unless the user runs
ALTER TABLE table_name ADD/DROP PARTITION commands on each of the newly added or removed partitions, respectively.
How can new partitions be added directly using hadoop fs -put?
Can someone give an example of this? As far as I know, partitions are added with alter table add partition or through dynamic partitioning.
If we add new partitions to the HDFS directory with the "hadoop fs -put" command, the partitioned Hive table (external or internal) does not know about the newly added partitions,
-> because the table's partition metadata is not updated when partitions are added directly in the Hadoop directory. The metadata is updated when we write to the partitioned table through Hive, i.e.
hive> insert into <table> partition(partition_field) select * from <non_partitioned_table>;
There are other ways to write to partitioned Hive tables as well, for example with Sqoop or Spark.
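For the example asked about, a minimal sketch of adding a partition directory by hand could look like the following. The warehouse path, the table name sales, the partition column dt, and the local file data.csv are all assumptions made for illustration. The run helper is also hypothetical: it only prints each command unless DRY_RUN=0 is set, so the script is safe to try away from a cluster.

```shell
#!/bin/sh
# Hypothetical table location and partition; adjust to your own table.
TABLE_DIR="/user/hive/warehouse/mydb.db/sales"
PART_DIR="$TABLE_DIR/dt=2020-01-01"   # Hive's key=value directory layout

# Print the command by default; set DRY_RUN=0 to execute it on a cluster node.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run hadoop fs -mkdir -p "$PART_DIR"        # create the partition directory
run hadoop fs -put data.csv "$PART_DIR/"   # copy a local file into it
run hadoop fs -ls "$PART_DIR"              # confirm the file is in HDFS
```

After these commands the file is in HDFS, but the table will not return rows for dt=2020-01-01 until the metastore learns about the partition.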
To refresh the metadata of the partitioned Hive table, use either of the ways below.
1. Run a metastore check with the repair table option
hive> msck repair table <db_name>.<table_name>;
If we added 10 partitions in HDFS, the above command adds all of them to the Hive table.
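As a rough sketch, the repair can be followed by show partitions to confirm what the metastore now lists. The names mydb and sales are hypothetical. Note that the repair is expected to pick up only directories that follow the key=value naming layout (for example dt=2020-01-01) under the table location.

```shell
#!/bin/sh
# Hypothetical database and table names.
DB="mydb"
TABLE="sales"

# Repair the partition list, then list the partitions the metastore knows about.
SQL="MSCK REPAIR TABLE ${DB}.${TABLE}; SHOW PARTITIONS ${DB}.${TABLE};"

# Printed here so the script is safe to run anywhere.
# On a node with the Hive CLI installed, run it with: hive -e "$SQL"
echo "$SQL"
```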
2. Add each partition to the table
hive> alter table <db_name>.<table_name> add partition(<partition_field_name>='<partition_value>') location '<hdfs_location_of_the_specific_partition>';
If we added 10 partitions in the HDFS directory, the above command adds only the specified partition to the Hive table, so it has to be run once for each partition.
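When several partitions were added by hand, the statements can be generated instead of typed one by one. The sketch below builds a single alter table statement with one partition clause per value, which Hive accepts in recent versions as far as I know. The database, table, partition column dt, and warehouse path are assumptions, and build_add_partitions is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical names; adjust to your own table.
DB="mydb"
TABLE="sales"
TABLE_DIR="/user/hive/warehouse/mydb.db/sales"

# Build one ALTER TABLE statement with a PARTITION clause per argument.
build_add_partitions() {
  sql="ALTER TABLE ${DB}.${TABLE} ADD IF NOT EXISTS"
  for dt in "$@"; do
    sql="${sql} PARTITION (dt='${dt}') LOCATION '${TABLE_DIR}/dt=${dt}'"
  done
  echo "${sql};"
}

# Print the statement for two partitions; pass the output to hive -e to run it.
build_add_partitions 2020-01-01 2020-01-02
```

IF NOT EXISTS keeps the statement from failing when one of the partitions is already registered.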
Either of these ways updates the Hive metadata with the partitions newly added in the HDFS location.