Hive: Difference between CONCATENATE and COMPACTION command


I'm working with two tables in Hive (version 1.2.1 in HDP 2.6.5):

  • tableA (ORC, not transactional)
  • tableB (ORC, transactional, with bucketing)

We periodically run an ALTER TABLE tableA CONCATENATE query to merge the small ORC files on HDFS into larger files.

For my transactional table (with buckets) I have to use the ALTER TABLE tableB COMPACT 'major' (or 'minor') query to merge the small files in my HDFS warehouse directory.
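For reference, here is a minimal sketch of the statements I mean. Note that the Hive DDL keyword for the second one is COMPACT. The forms below assume both tables are unpartitioned, which is an assumption for illustration; a partitioned table would need a PARTITION (...) clause after the table name.

```sql
-- Non-transactional ORC table: merge small ORC files into larger ones.
ALTER TABLE tableA CONCATENATE;

-- Transactional (ACID) table: queue a compaction request.
-- 'minor' merges delta files; 'major' rewrites deltas and base into a new base.
ALTER TABLE tableB COMPACT 'minor';
ALTER TABLE tableB COMPACT 'major';

-- Compaction requests are processed asynchronously; this lists their state.
SHOW COMPACTIONS;
```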

My question is: what is the difference between these two commands? Is it possible to run a compaction on the non-transactional table, or CONCATENATE on the transactional one? And why are there two different commands if both essentially merge small files?

Thank you!