
Hive manual compactions stay in the initiated state with no worker assigned

New Contributor

Hi there,

I am running manual compactions on my table, but when I run

show compactions;

it shows:

Database  Table     Partition        Type   State      Worker  Start Time  Duration(ms)  HadoopJobId
default   adevents  date=2014-07-07  MINOR  initiated  ---     ---         ---           None
default   adevents  date=2010-08-13  MAJOR  initiated  ---     ---         ---           None

Time taken: 0.009 seconds, Fetched: 3 row(s)

There is no worker ID or Hadoop job ID, and the state stays at initiated.

Could anyone please help me solve this?

FYI I have changed the compactor settings in hive-site.xml:

<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>

Thank you so much!


Expert Contributor

Do you have the standalone metastore running? That is where compaction jobs are actually generated and submitted.
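As a rough sketch of what this means for configuration: the compactor settings only take effect if the metastore service itself reads them, and the transaction manager also has to be enabled. The fragment below uses standard Hive property names, but the values are illustrative assumptions rather than a tested EMR setup.

```xml
<!-- Sketch of a hive-site.xml fragment; values are illustrative assumptions. -->

<!-- Transaction support, needed for ACID tables. -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>

<!-- Compactor settings: these must be in the configuration that the
     metastore service loads, not only in the client's configuration. -->
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

The metastore would need a restart after such a change. If no metastore instance has worker threads configured, queued compaction requests would be expected to stay in the initiated state.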

New Contributor

I am using Hive on EMR, which I assume uses Hive's own metastore. Do I need to add an external metastore for Hive, such as the Glue Data Catalog? The problem is that Glue does not support Hive transactions. Do you know how I can solve this? Thank you so much for your reply! @Eugene Koifman

Expert Contributor

The Hive metastore is a service. It can run in embedded mode or in standalone mode. You can have several instances running in your cluster, but all instances must share the same backend RDBMS. This should help:

New Contributor

Thank you so much for the resources!


Hi @Xue Chen!

I have a couple of questions.

1) How long has it stayed in the initiated state?

2) Does the compaction ever happen?

In my experience, compaction runs into issues when it is started by a user other than hive. The owner of the table should also be hive.

You could try setting the owner of the table to hive and running the compactions as the hive user, if you have not done so already.

Please let me know if that helps you!



New Contributor

Thank you for your reply!

The compaction never happens.

I am using Hive on AWS EMR. I created the table from ORC files in S3 using the following command:

CREATE TABLE tablename (id STRING, ts TIMESTAMP, category STRING)
CLUSTERED BY (category) INTO 1 BUCKETS
STORED AS ORC LOCATION 's3://orc-mutation-test/20/'
TBLPROPERTIES ("orc.compress"="SNAPPY", "transactional"="true");

Created this way, the user and owner of this table should be hive, right?

I suspect the cause is that I have not set up a remote metastore, so the compaction jobs cannot be generated and submitted.
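For reference, a minimal sketch of how a client is usually pointed at a remote (standalone) metastore. The host name metastore-host is a hypothetical placeholder, and 9083 is the default metastore Thrift port; neither value comes from this cluster.

```xml
<!-- Sketch of a client-side hive-site.xml fragment.
     "metastore-host" is a hypothetical placeholder; 9083 is the default
     Thrift port of the metastore service. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host:9083</value>
</property>
```

With this property set, the client connects to the metastore service over Thrift instead of running an embedded metastore. That service (started, for example, with hive --service metastore) is then the process that needs the compactor settings.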
