Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

How do I run manual compaction on "nested partitions"? I tried searching the Hive docs but could not find an example.

New Member

The nested partition structure is as follows:

/dbdata/tblocation/ordersummery/tran_date=2016-04-06/ordertype=SUPE

hive> ALTER TABLE dtl.ordersummery PARTITION (tran_date='2016-04-06') COMPACT 'minor';

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Compaction can only be requested on one partition at a time.

hive> ALTER TABLE dtl.ordersummery PARTITION (tran_date='2016-04-06/ordertype=SUPE') COMPACT 'minor';

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Invalid partition spec specified

1 ACCEPTED SOLUTION

Super Guru

can you try this?

ALTER TABLE dtl.ordersummery PARTITION (tran_date='2016-04-06',ordertype='SUPE') COMPACT 'minor';
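To expand on this a little: with nested (multi-level) partitions, the partition spec must list every partition column, separated by commas, rather than embedding the second level in the path string. Compaction requests are also queued asynchronously, so you can follow up with `SHOW COMPACTIONS` to watch the request's state. A sketch, using the table and partition names from the question above:

```sql
-- Full partition spec: one entry per partition column, comma-separated.
ALTER TABLE dtl.ordersummery
  PARTITION (tran_date='2016-04-06', ordertype='SUPE')
  COMPACT 'minor';

-- The request is queued and processed by a background worker;
-- check its state here (e.g. initiated, working, ready for cleaning, failed).
SHOW COMPACTIONS;
```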


5 REPLIES

New Member

Thanks, it works.

New Member

I'll add this to the HWKs Hive docs. Thanks @Rajkumar Singh

Contributor

@Rajkumar Singh

  • Accepted all dependent configurations (default values).
  • Restarted Hive and any other related services (stale configs).
  • Created a test table:
CREATE TABLE resource.hello_acid (key int, value int)
PARTITIONED BY (load_date date)
CLUSTERED BY(key) INTO 3 BUCKETS
STORED AS ORC TBLPROPERTIES ('transactional'='true');
  • Inserted a few rows:
INSERT INTO hello_acid partition (load_date='2016-03-03') VALUES (1, 1);
INSERT INTO hello_acid partition (load_date='2016-03-03') VALUES (2, 2);
INSERT INTO hello_acid partition (load_date='2016-03-03') VALUES (3, 3);
  • Everything looked great at this point. I was able to add rows and query the table as usual.
  • This is the content of the table partition's HDFS directory (/apps/hive/warehouse/resource.db/hello_acid/load_date=2016-03-03/): 3 delta directories (one per insert transaction).
  • In Hive, I issued a minor compaction command. If I understood it right, this should have merged all delta directories into one. It didn't work!
ALTER TABLE hello_acid partition (load_date='2016-03-03') COMPACT 'minor';
  • Next, I issued a major compaction command. This should have deleted all delta directories and created a single base directory with all the data. It didn't work either!
  • Finally, I ran this last command:
SHOW COMPACTIONS;

+-----------+-------------+-----------------------+--------+---------+---------------------------------+----------------+
| dbname    | tabname     | partname              | type   | state   | workerid                        | starttime      |
+-----------+-------------+-----------------------+--------+---------+---------------------------------+----------------+
| Database  | Table       | Partition             | Type   | State   | Worker                          | Start Time     |
| resource  | hello_acid  | load_date=2016-03-03  | MINOR  | failed  | hadoop-master1.claro.com.co-52  | 1506440161747  |
| resource  | hello_acid  | load_date=2016-03-03  | MAJOR  | failed  | hadoop-master2.claro.com.co-46  | 1506440185353  |
+-----------+-------------+-----------------------+--------+---------+---------------------------------+----------------+

Rising Star

You are not getting the desired result because your compaction failed. Please check the YARN logs to understand what might have gone wrong.
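Besides the YARN logs, a common first check when compactions never succeed is whether the compactor is actually enabled on the Metastore. A sketch, assuming the table from the steps above; the property names are standard Hive ACID settings, and the `ALTER TABLE` simply re-requests the compaction once configuration is fixed:

```sql
-- Compaction is run by background threads inside the Hive Metastore.
-- hive.compactor.initiator.on must be true on one Metastore instance,
-- and hive.compactor.worker.threads must be greater than 0 there;
-- otherwise queued compaction requests are never processed.
SET hive.compactor.initiator.on;
SET hive.compactor.worker.threads;

-- After correcting the configuration, re-request the compaction:
ALTER TABLE resource.hello_acid
  PARTITION (load_date='2016-03-03')
  COMPACT 'major';
```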