INSERT OVERWRITE TABLE not compact ORC Files in Partition

New Contributor

I have the following external table:

PARTITIONED BY (`y_m` int)
CLUSTERED BY (l_n)
INTO 16 BUCKETS
STORED AS ORC;


When I try to compact the ORC files (5000 files in the HDFS path), the files are not compacted and the size of the individual files does not change:

INSERT OVERWRITE TABLE TableName PARTITION (y_m='201906')
SELECT
...
....
FROM TableName
WHERE y_m='201906';
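For context, here is a minimal sketch of session settings that are often checked before re-running a rewrite like the one above. It assumes a Hive session in which these properties may be changed with SET, which may not hold on every cluster. The size thresholds are illustrative values, not recommendations. With 16 buckets, an enforced bucketed insert would normally be expected to leave roughly one file per bucket in the partition. Whether the merge step also applies to bucketed output can depend on the Hive version.

```sql
-- Assumption: Hive 1.x only. From Hive 2.0 bucketing is always enforced and
-- this property no longer exists, so skip this line on newer versions.
SET hive.enforce.bucketing=true;

-- Ask Hive to merge small output files at the end of the job.
SET hive.merge.mapfiles=true;      -- map-only jobs
SET hive.merge.mapredfiles=true;   -- map-reduce jobs
SET hive.merge.tezfiles=true;      -- Tez jobs

-- Illustrative thresholds in bytes: trigger a merge when the average output
-- file is smaller than 128 MB, and aim for merged files of about 256 MB.
SET hive.merge.smallfiles.avgsize=134217728;
SET hive.merge.size.per.task=268435456;
```

These statements would be run in the same session, immediately before the INSERT OVERWRITE.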


Re: INSERT OVERWRITE TABLE not compact ORC Files in Partition

Contributor

Re: INSERT OVERWRITE TABLE not compact ORC Files in Partition

New Contributor

Hi @Vikas Srivastava ,

Thank you for your suggestion,

I had already tried the suggestions on the page you linked, but there are two problems:

  • The mapreduce.input.fileinputformat.split.* properties cannot be changed at runtime with SET;
  • The table is bucketed, and CONCATENATE does not support bucketed tables; I get an error when I run it.

Best regards

Vincenzo