Support Questions
Find answers, ask questions, and share your expertise

how to compress existed table in Hive

Highlighted

how to compress existed table in Hive

New Contributor

Hi,

My hive table is having 200 TB of data and it's not compressed.

Could you please help me how to compress this table without dropping.

Thanks,

Ram

7 REPLIES 7
Highlighted

Re: how to compress existed table in Hive

Highlighted

Re: how to compress existed table in Hive

Super Guru

Really the only way is his CTAS if you want to change compression format or add compression. .

Highlighted

Re: how to compress existed table in Hive

New Contributor

could you please provide the sample code.

Highlighted

Re: how to compress existed table in Hive

As simple as this:

CREATE TABLE t1_orc STORED AS ORC AS SELECT * FROM <your-existing-table>;

Note that if you have a single 200T table, this is going to take a while. You can test on a smaller table first.

Highlighted

Re: how to compress existed table in Hive

If your table is partitioned you have to create it first as "STORED AS ORC" and then do " INSERT INTO" it listing all fields in SELECT. Also enable dynamic partitions.

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
create table if not exists t1 (a int, b int) partitioned by (c int);       -- your original table
create table t1orc (a int, b int) partitioned by (c int) stored as ORC;    -- your compressed table
insert into table t1orc partition(c) select a, b, c from t1;
Highlighted

Re: how to compress existed table in Hive

New Contributor

Hi,

Thanks for your prompt response.

My table is partitioned table if I create new table with CTAS will it be also partitioned?

Thanks

Highlighted

Re: how to compress existed table in Hive

Contributor

CTAS has these restrictions:

  • The target table cannot be a partitioned table.
  • The target table cannot be an external table.
  • The target table cannot be a list bucketing table.