Created 03-16-2017 04:59 AM
Hi,
My hive table is having 200 TB of data and it's not compressed.
Could you please help me how to compress this table without dropping.
Thanks,
Ram
Created 03-16-2017 05:01 AM
This post has a pretty good explanation:
https://community.hortonworks.com/questions/28856/hive-table-format-and-compression.html
Created 03-16-2017 05:11 AM
Really the only way is his CTAS if you want to change compression format or add compression. .
Created 03-16-2017 05:17 AM
could you please provide the sample code.
Created 03-16-2017 05:42 AM
As simple as this:
CREATE TABLE t1_orc STORED AS ORC AS SELECT * FROM <your-existing-table>;
Note that if you have a single 200T table, this is going to take a while. You can test on a smaller table first.
Created 03-16-2017 07:48 AM
If your table is partitioned you have to create it first as "STORED AS ORC" and then do " INSERT INTO" it listing all fields in SELECT. Also enable dynamic partitions.
set hive.exec.dynamic.partition=true; set hive.exec.dynamic.partition.mode=nonstrict; create table if not exists t1 (a int, b int) partitioned by (c int); -- your original table create table t1orc (a int, b int) partitioned by (c int) stored as ORC; -- your compressed table insert into table t1orc partition(c) select a, b, c from t1;
Created 03-16-2017 05:53 AM
Hi,
Thanks for your prompt response.
My table is partitioned table if I create new table with CTAS will it be also partitioned?
Thanks
Created 03-16-2017 06:05 AM
CTAS has these restrictions: