How to abort HIVE-compactions

Explorer

Hi all!

We've run into a strange situation with transactional tables. For several dropped tables we see many compactions in the "attempted" state. There are no files and no tables, but the compactions are still in the compaction queue. As I understand it, these compactions will never complete. Is there any way to remove them from the queue?

Any help would be appreciated.

P.S. Sorry for my English.

3 REPLIES

Super Guru

@Artur Dushelyubov

"Attempted" means that the initiator tried to schedule a compaction but failed. As such, there is no actual compaction associated with those requests. Other than being visually unpleasant, they are nothing to worry about. They are part of the metastore log and are shown only up to the configured threshold (see the threshold for attempted below). Entries beyond that number are not displayed.

Compaction History
hive.compactor.history.retention.succeeded (default: 3, Metastore): Number of successful compaction entries to retain in history (per partition).
hive.compactor.history.retention.failed (default: 3, Metastore): Number of failed compaction entries to retain in history (per partition).
hive.compactor.history.retention.attempted (default: 2, Metastore): Number of attempted compaction entries to retain in history (per partition).
hive.compactor.initiator.failed.compacts.threshold (default: 2, Metastore): Number of consecutive failed compactions for a given partition after which the Initiator will stop attempting to schedule compactions automatically. It is still possible to use ALTER TABLE to initiate compaction. Once a manually initiated compaction succeeds, auto-initiated compactions will resume. Note that this must be less than hive.compactor.history.retention.failed.
hive.compactor.history.reaper.interval (default: 2m, Metastore): Controls how often the process that purges the historical record of compactions runs.
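
To make the retention settings above more concrete, here is a rough Python sketch of "keep only the newest N entries per partition and state". It is only an illustration of the idea, not Hive's actual purge code: the function name purge_history and the (partition, state, finished_at) tuple shape are made up for this example, and only the default values come from the list above.

```python
from collections import defaultdict

# Defaults copied from the property list above; the dict itself is illustrative.
RETENTION = {"succeeded": 3, "failed": 3, "attempted": 2}


def purge_history(entries, retention=RETENTION):
    """Keep only the newest N history entries per (partition, state).

    `entries` is a list of (partition, state, finished_at) tuples. This shape
    is hypothetical and is not the metastore's real schema.
    """
    grouped = defaultdict(list)
    for entry in entries:
        partition, state, _ = entry
        grouped[(partition, state)].append(entry)

    kept = []
    for (_, state), group in grouped.items():
        group.sort(key=lambda e: e[2], reverse=True)  # newest first
        # States without a configured limit are kept in full.
        kept.extend(group[: retention.get(state, len(group))])
    return sorted(kept, key=lambda e: e[2])


# Five "attempted" entries and one "failed" entry for the same partition.
history = [("db.t/p=1", "attempted", t) for t in range(5)]
history.append(("db.t/p=1", "failed", 5))
remaining = purge_history(history)
```

With the default of 2 for attempted, only the two most recent attempted entries survive in this example, together with the single failed entry.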

If this was helpful, please vote/accept best answer.

++++++

A little bit of theory below for others who may have a similar question.

SHOW COMPACTIONS returns a list of all tables and partitions currently being compacted or scheduled for compaction when Hive transactions are being used, including this information:

  • database name
  • table name
  • partition name (if the table is partitioned)
  • whether it is a major or minor compaction
  • the state the compaction is in, which can be:
    • "initiated" – waiting in the queue to be compacted
    • "working" – being compacted
    • "ready for cleaning" – the compaction has been done and the old files are scheduled to be cleaned
    • "failed" – the job failed. The metastore log will have more detail.
    • "succeeded" – A-ok
    • "attempted" – initiator attempted to schedule a compaction but failed. The metastore log will have more information.
  • thread ID of the worker thread doing the compaction (only if in working state)
  • the time at which the compaction started (only if in working or ready for cleaning state)

Compactions are initiated automatically, but can also be initiated manually with an ALTER TABLE COMPACT statement.
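
For reference, the manual statement has the form ALTER TABLE ... [PARTITION (...)] COMPACT 'major' or 'minor'. The small Python helper below just builds that statement as a string so the shape is easy to see. The helper name compact_statement and the table and partition names are invented for this example, and it does no identifier quoting or escaping, so treat it as a sketch rather than something to run against untrusted input.

```python
def compact_statement(table, compaction_type="major", partition=None):
    """Build an ALTER TABLE ... COMPACT statement as a string.

    `partition` is an optional dict of partition column -> value.
    No escaping is done; this is only meant to show the statement shape.
    """
    if compaction_type not in ("major", "minor"):
        raise ValueError("compaction_type must be 'major' or 'minor'")
    clause = ""
    if partition:
        spec = ", ".join(f"{key}='{value}'" for key, value in partition.items())
        clause = f" PARTITION ({spec})"
    return f"ALTER TABLE {table}{clause} COMPACT '{compaction_type}'"


# Hypothetical table "sales" partitioned by "ds".
print(compact_statement("sales", "major", {"ds": "2017-01-01"}))
print(compact_statement("sales", "minor"))
```

The resulting string can be pasted into beeline or the Hive CLI; the request should then appear in SHOW COMPACTIONS.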

New Contributor

Hello @Constantin Stanca, where do we set these configuration variables for them to take effect?

Super Collaborator

This may be due to not having https://issues.apache.org/jira/browse/HIVE-10632 in your build. Some data in internal tables was not cleaned up when the tables were dropped, which causes the Initiator to keep trying to schedule compactions.