Support Questions

GrazittiAPI · ‎07-17-2025

Hello

We're testing Iceberg table functionality on our CDP 7.3.1 environment. After reviewing the documentation and running some tests, it seems that the only way to compact the tables efficiently after modifications is via Spark.

Impala's OPTIMIZE TABLE command rewrites the whole table (the file size threshold option is not implemented until version 4.5), and the Hive 4 version included in our CDP does not seem able to compact them.

The Spark procedures rewrite_position_delete_files and rewrite_data_files work correctly, but we wonder whether we have misunderstood the documentation or are missing an alternative.

Thanks in advance for your help.

Regards,

Miguel

DianaTorres · ‎07-17-2025

Hi @Shawn_Wang, do you have any insights here? Thanks!

Regards,

Diana Torres,
Senior Community Moderator



Iceberg table compaction in CDP 7.3.1