
hive compaction stuck in initiated state

New Contributor

Hi,

I am using Hive 3.1.2 with Hadoop 3.3.5. I manually trigger a minor compaction, but it stays stuck in the "initiated" state. I can't find any documentation on how to configure the compactor workers. I read that compaction is handled automatically when you set hive.txn.manager to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager and hive.support.concurrency to true. I have also set the parameters below, but the compaction is still stuck in the initiated state.

 

$ hive -e 'show compactions' 2>/dev/null
CompactionId Database Table Partition Type State Worker Start Time Duration(ms) HadoopJobId
1 default employee_trans --- MINOR initiated --- --- --- ---

 

hive.compactor.abortedtxn.threshold=200
hive.compactor.check.interval=100s
hive.compactor.cleaner.run.interval=500ms
hive.compactor.compact.insert.only=true
hive.compactor.delta.num.threshold=2
hive.compactor.delta.pct.threshold=0.1
hive.compactor.history.reaper.interval=2m
hive.compactor.history.retention.attempted=2
hive.compactor.history.retention.failed=3
hive.compactor.history.retention.succeeded=3
hive.compactor.initiator.failed.compacts.threshold=5
hive.compactor.initiator.on=true
hive.compactor.max.num.delta=100
hive.compactor.worker.threads=2
hive.compactor.worker.timeout=400s
hive.metastore.thrift.compact.protocol.enabled=false
hive.test.fail.compaction=false
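
For completeness, the transaction settings mentioned above sit in hive-site.xml roughly like this (an illustrative excerpt, not my exact file):

<!-- illustrative hive-site.xml excerpt; values as described above -->
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>2</value>
</property>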

$ hdfs dfs -ls /user/hive/warehouse/employee_trans;
Found 10 items
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:02 /user/hive/warehouse/employee_trans/delta_0000001_0000001_0000
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:09 /user/hive/warehouse/employee_trans/delta_0000002_0000002_0000
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:09 /user/hive/warehouse/employee_trans/delta_0000003_0000003_0000
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:10 /user/hive/warehouse/employee_trans/delta_0000004_0000004_0000
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:13 /user/hive/warehouse/employee_trans/delta_0000005_0000005_0000
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:13 /user/hive/warehouse/employee_trans/delta_0000006_0000006_0000
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:14 /user/hive/warehouse/employee_trans/delta_0000007_0000007_0000
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:15 /user/hive/warehouse/employee_trans/delta_0000008_0000008_0000
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:15 /user/hive/warehouse/employee_trans/delta_0000009_0000009_0000
drwxr-xr-x - hadoop supergroup 0 2023-05-11 09:16 /user/hive/warehouse/employee_trans/delta_0000010_0000010_0000

1 REPLY

Master Collaborator

@mwblee I am not sure whether you are using a Cloudera Hive distribution. If you are, consider upgrading to the latest CDP version, where we have fixed many issues around the compactor initiator/worker/cleaner (for the initiator, e.g., upstream Jiras HIVE-21917, HIVE-22568, HIVE-22081).

 

For this specific issue, you may want to look at several factors, such as the Hive Metastore being overloaded, or a slow/large Metastore backend database (in particular the transaction-related tables).
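
One quick sanity check (a sketch only; table names come from the standard Hive 3 metastore schema, so adjust schema/casing to your backend database) is to look at the size of the transaction tables and at the queued compaction entries directly in the metastore RDBMS:

-- Very large transaction tables (e.g. from many aborted transactions)
-- can slow down the compactor Initiator/Worker.
SELECT COUNT(*) FROM TXNS;
SELECT COUNT(*) FROM TXN_COMPONENTS;
SELECT COUNT(*) FROM COMPLETED_TXN_COMPONENTS;
-- Requests still waiting in the compaction queue (CQ_STATE 'i' = initiated).
SELECT * FROM COMPACTION_QUEUE;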

You can also enable DEBUG logging in the Hive Metastore; that will give more information on why/where the compactor is stuck. If you are using open-source Hive, upgrading to Hive 4.x will give you a much better experience with compaction.
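
For example, a minimal sketch for turning on DEBUG for the compactor threads, assuming the stock conf/hive-log4j2.properties on the metastore host (the compactor classes live under org.apache.hadoop.hive.ql.txn.compactor):

# Append "compactor" to the existing "loggers = ..." list in hive-log4j2.properties,
# add the logger definition below, then restart the metastore service.
logger.compactor.name = org.apache.hadoop.hive.ql.txn.compactor
logger.compactor.level = DEBUG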