I want to create hive external table with multiple partitioning collumns and should allow dynamic partitions. Suppose i create a table
create external table sample1( name string,age int) partitioned by (year int, month int,day int) location '/path to hdfs storage';
now i just run
msck repair table sample1;
So, is it correct method to achieve it and if its not can you help me to do it.
MSCK(Metastore check command) it is used when the partitions are added directly to HDFS directory(say by using hadoop fs -put command), To make aware of Hive metastore about these newly added partitions we are using msck repair table command.
msck repair table <table-name>;
We can add partitions by alter table as follows but in this case we need to run the command on each of the newly added partitions.
alter table partition_table add partition(<partition-name>) location '<directory-path>';
These commands won't create partitions or load data to the partitions,By running these commands we are going to add partitions to the table update the metastore with newly added partitions.
If you want to create and load the data to partitions refer the following link.
@Shu I know this is an old question but I have a similar use case and wanted to follow up to your response. If I run alter table add partition command, it will create an exclusive lock on that partition not allowing the end user to query that partition during loading. If I run MSCK command, will it create an exclusive lock or is the user prone to a dirty read while the command is running?