Iceberg table compaction in CDP 7.3.1

Explorer

Hello

We're testing Iceberg table functionality in our CDP 7.3.1 environment. After reviewing the documentation and running some tests, it seems that the only way to compact the tables efficiently after modifications is via Spark.

Impala's OPTIMIZE TABLE command rewrites the whole table (the file size threshold option is not implemented until version 4.5), and Hive does not seem able to compact Iceberg tables in the version 4 release included in our CDP.

The Spark procedures rewrite_position_delete_files and rewrite_data_files work correctly, but we wonder whether we have misunderstood the documentation or are missing an alternative.
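
For readers following along, here is a minimal PySpark sketch of how the two procedures named above might be invoked. It is an illustration under assumptions, not our exact job: the catalog name `spark_catalog`, the table name `db.events`, and the helper names `build_compaction_statements` and `compact` are hypothetical, and the session is assumed to already be configured with an Iceberg catalog. The `target-file-size-bytes` option is passed only to rewrite_data_files.

```python
def build_compaction_statements(catalog, table, target_file_size_bytes=None):
    """Build the Spark SQL CALL statements for Iceberg compaction.

    catalog and table are placeholders supplied by the caller; nothing
    here is specific to a particular CDP setup.
    """
    options = ""
    if target_file_size_bytes is not None:
        # Iceberg procedure options are passed as a map of string to string.
        options = (
            ", options => map('target-file-size-bytes', "
            f"'{int(target_file_size_bytes)}')"
        )
    return [
        # Rewrites small data files into larger ones.
        f"CALL {catalog}.system.rewrite_data_files(table => '{table}'{options})",
        # Compacts position delete files left behind by row-level changes.
        f"CALL {catalog}.system.rewrite_position_delete_files(table => '{table}')",
    ]


def compact(spark, catalog, table, target_file_size_bytes=None):
    """Run both procedures on an existing SparkSession (assumed Iceberg-enabled)."""
    for statement in build_compaction_statements(
        catalog, table, target_file_size_bytes
    ):
        spark.sql(statement).show(truncate=False)


if __name__ == "__main__":
    # Hypothetical names; printing avoids needing a live cluster.
    for statement in build_compaction_statements(
        "spark_catalog", "db.events", 128 * 1024 * 1024
    ):
        print(statement)
```

Inside a spark-submit job or pyspark shell, `compact(spark, "spark_catalog", "db.events")` would run both calls in sequence and show each procedure's result row.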

Thanks in advance for your help.

Regards,

Miguel

1 REPLY

Community Manager

Hi @Shawn_Wang, do you have any insights here? Thanks!


Regards,

Diana Torres,
Senior Community Moderator

