My Hive table holds 200 TB of data and is not compressed.
Could you please help me compress this table without dropping it?
As simple as this:
CREATE TABLE t1_orc STORED AS ORC AS SELECT * FROM <your-existing-table>;
Note that if you have a single 200 TB table, this is going to take a while. You can test on a smaller table first.
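A trial run on a sample could look like the sketch below. The table name t1_orc_sample and the row limit are made up for illustration. The "orc.compress" table property only makes the codec explicit: ZLIB is the ORC default, and SNAPPY is a common alternative.

```sql
-- Hypothetical sample table; replace <your-existing-table> as in the statement above.
-- "orc.compress" selects the ORC codec (ZLIB is the default).
CREATE TABLE t1_orc_sample
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB")
AS SELECT * FROM <your-existing-table> LIMIT 1000000;

-- Inspect the table parameters (for example totalSize) to see the on-disk footprint.
DESCRIBE FORMATTED t1_orc_sample;
```

Comparing the reported size with the size of the same rows in the source table should give a rough idea of the compression ratio to expect, assuming table statistics are being gathered.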
If your table is partitioned, you have to create the target table first with "STORED AS ORC" and then "INSERT INTO" it, listing all fields in the SELECT with the partition column last. Also enable dynamic partitions.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- your original table
create table if not exists t1 (a int, b int) partitioned by (c int);

-- your compressed table
create table t1orc (a int, b int) partitioned by (c int) stored as ORC;

insert into table t1orc partition(c) select a, b, c from t1;
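For a table this large, it may be more practical to copy a few partitions per run instead of everything in one job. This is only a sketch using the t1 and t1orc tables from the example; the partition range 1 to 10 is a hypothetical value, so substitute your own.

```sql
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Copy a subset of partitions per run; the range 1..10 is a hypothetical example.
insert into table t1orc partition(c)
select a, b, c from t1 where c between 1 and 10;

-- List the partitions that now exist in the ORC table.
show partitions t1orc;
```

Repeating the insert with the next range keeps each job small, and a failed run only needs that one range redone.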
Thanks for your prompt response.
My table is partitioned. If I create the new table with CTAS, will it also be partitioned?
CTAS has these restrictions: