Created 01-11-2017 03:37 PM
Hi, i created an external table in HIVE with 150 columns. i have a .csv file for each day , and eventually i will have to load data for 4 years.
i just loaded one month worth of files which turned into 2mill rows.
i now like to partition the table by date (which first column in the table and file).
1.how can i partion on existing table and 2. can i do it dynamically when i load other files.
Regards,Sai
Created 01-11-2017 07:13 PM
1. You can add partitions to existing table using ALTER TABLE <table_name> ADD PARTITION (date=' ') LOCATION '/path'
2. For dynamic partitioning:
--Set following two properties for your Hive session:
SET hive.exec.dynamic.partition=true; SET hive.exec.dynamic.partition.mode=nonstrict;
--Create your staging table without Partition
create external table table_name (col_01,col_02) row format delimited fields terminated by ",";
--load data to your Hive Table:
load data local inpath '/path' into table table_name;
--Create production table with the columns you want to partition upon. In this case you want to partition upon date column:
create external table table_name (col_01,col_02) partitioned by (date_part string) row format delimited fields terminated by ',';
insert overwrite table table_name partition(date_part) select col_01,col_02 from staging_table;
--Verify data is correctly populated in partition:
hive> select * from table_name where date_part="20170111";
Regards,
Anubhav
Created 01-11-2017 07:58 PM
Hi @aawasthi,
suppose i have an external table "test_display" which points to '/user/test/display' which already have 30 files one for each day. and i can query the table. when i created the table i did not specify any partition.
now like you mentioned above i tried to execute the command below
ALTER TABLE test_display ADD PARTITION (moat_date='2017-01-05') LOCATION '/user/test/display'
Error occurred executing hive query: Error while compiling statement: FAILED: ValidationFailureSemanticException table is not partitioned but partition spec exists: {moat_date=2017-01-05}
do i have to manually create the folder 2017-01-05 under '/user/test/display'
Created 01-11-2017 09:06 PM
Your create table statement should define partition name.
CREATE EXTERNAL TABLE table_name ( ... ) PARTITIONED BY (moat_date STRING);
Created 02-01-2017 07:06 PM
Can you provide output of below?
1. create table def
hive> Show create table test_display
2. Files in the hdfs location
hadoop fs -ls <table_display table hdfs location>