Hive partition clarification


This is from the Hive manual:

Recover Partitions (MSCK REPAIR TABLE)

Hive stores a list of partitions for each table in its metastore. If, however, new partitions are directly added to HDFS (say by using hadoop fs -put command) or removed from HDFS, the metastore (and hence Hive) will not be aware of these changes to partition information unless the user runs ALTER TABLE table_name ADD/DROP PARTITION commands on each of the newly added or removed partitions, respectively.

Question:

How can new partitions be added directly using hadoop fs -put?

Can someone give an example of this? As far as I know, partitions are added either with ALTER TABLE ... ADD PARTITION or through dynamic partitioning.


Re: Hive partition clarification


@vamsi valiveti

If we add new partition directories to HDFS with the "hadoop fs -put" command, the Hive partitioned table (external or internal) does not know about the newly added partitions.
This is because the table's partition metadata is not updated when files are placed directly in the HDFS directory. The metadata is updated only when we write into the partitioned table through Hive, for example:

hive> insert into <table> partition(partition_field) select * from <non_partitioned_table>;

There are also other ways to write into Hive partitioned tables, such as Sqoop or Spark.

To refresh the metadata of the Hive partitioned table, use either of the approaches below.

1. Run a metastore check with the repair table option

hive> msck repair table <db_name>.<table_name>;

If we added 10 partitions in HDFS, the above command adds all of them to the Hive table, provided the directories sit under the table location and are named <partition_field>=<partition_value>.
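To answer the original question with a concrete case, here is a minimal sketch of the "put, then repair" workflow. The table sales_db.sales, its partition column dt, the HDFS path and the local file name are all hypothetical, chosen only for illustration. The script is a dry run by default: it prints the commands instead of executing them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Invoke with RUN= (empty) on a cluster node to execute them for real.
RUN=${RUN-echo}

# Hypothetical names, for illustration only.
TABLE_DIR=/warehouse/sales_db.db/sales   # assumed HDFS location of the table
LOCAL_FILE=sales_2019-01-01.csv          # assumed local data file

# Build a partition directory path in the <column>=<value> form
# that msck repair table looks for under the table location.
partition_dir() {
  # $1 = table directory, $2 = partition column, $3 = partition value
  printf '%s/%s=%s\n' "$1" "$2" "$3"
}

PART_DIR=$(partition_dir "$TABLE_DIR" dt 2019-01-01)

# 1. Create the partition directory and copy the file into it.
#    At this point Hive still does not know the partition exists.
$RUN hadoop fs -mkdir -p "$PART_DIR"
$RUN hadoop fs -put "$LOCAL_FILE" "$PART_DIR/"

# 2. Register every partition directory the metastore is missing.
$RUN hive -e "msck repair table sales_db.sales;"
```

Until step 2 runs, a query on the table does not return the rows from the new directory, which is the behaviour the manual excerpt describes.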

(or)

2. Add each partition to the table explicitly

hive> alter table <db_name>.<table_name> add partition(<partition_field_name>='<partition_value>') location '<hdfs_location_of_the_specific_partition>';

If we added 10 partitions in the HDFS directory, the above command adds only the one specified partition to the Hive table, so it has to be run once per partition.
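As a rough sketch of this second approach, the script below puts a file into a directory that does not follow the <partition_field>=<partition_value> naming and then registers it with an explicit location. The table sales_db.sales, the column dt, the landing path and the file name are hypothetical. The optional "if not exists" clause is added here so the statement can be re-run safely. The script only prints the commands unless RUN is set to empty.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Invoke with RUN= (empty) on a cluster node to execute them for real.
RUN=${RUN-echo}

# Hypothetical names, for illustration only.
PART_LOCATION=/data/landing/sales/2019-01-02   # assumed HDFS directory
LOCAL_FILE=sales_2019-01-02.csv                # assumed local data file

# Build the statement that maps one partition value to one directory.
add_partition_sql() {
  # $1 = db.table, $2 = partition column, $3 = value, $4 = HDFS location
  printf "alter table %s add if not exists partition (%s='%s') location '%s';\n" \
    "$1" "$2" "$3" "$4"
}

SQL=$(add_partition_sql sales_db.sales dt 2019-01-02 "$PART_LOCATION")

# 1. Copy the data into HDFS; the directory name is arbitrary here.
$RUN hadoop fs -mkdir -p "$PART_LOCATION"
$RUN hadoop fs -put "$LOCAL_FILE" "$PART_LOCATION/"

# 2. Tell the metastore about this one partition and where it lives.
$RUN hive -e "$SQL"
```

The explicit location is what makes this form useful when the data does not live under the table directory; the repair command cannot be pointed at such a path.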

Either approach updates the Hive metadata with the partitions newly added at the HDFS location.