Hello
We're testing the Iceberg tables functionality in our CDP 7.3.1 environment, and after reviewing the documentation and running some tests, it seems that the only way to compact the tables efficiently after modifications is via Spark.
Impala's OPTIMIZE TABLE statement rewrites the entire table (the file size threshold option is not available until Impala 4.5), and Hive 4, the version included in our CDP, does not seem able to compact them either.
The Spark procedures rewrite_position_delete_files and rewrite_data_files work correctly, but we wonder whether we've misunderstood the documentation or are missing an alternative.
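For reference, these are roughly the calls we're running (catalog, database, and table names are placeholders for our actual ones):

```sql
-- Compact small data files into larger ones (Iceberg Spark procedure)
CALL my_catalog.system.rewrite_data_files(table => 'my_db.my_table');

-- Consolidate position delete files accumulated by row-level modifications
CALL my_catalog.system.rewrite_position_delete_files(table => 'my_db.my_table');
```

Both procedures behave as expected; our question is only about whether an Impala- or Hive-based alternative exists in this CDP version.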
Thanks in advance for your help.
Regards,
Miguel