Hive: Difference between CONCATENATE and COMPACTION command

I'm working with two tables in Hive (version 1.2.1 in HDP 2.6.5):

  • tableA (ORC, not transactional)
  • tableB (ORC, transactional, with bucketing).

We periodically run an ALTER TABLE tableA CONCATENATE query to merge the small ORC files on HDFS into bigger files.

For my transactional table (with buckets) I have to use the ALTER TABLE tableB COMPACT 'major' (or 'minor') query to merge the small files in my HDFS warehouse directory.
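For reference, this is roughly what the two statements look like. It is only a sketch: it assumes the table names above and unpartitioned tables (a partitioned table would need a PARTITION clause), and note that the Hive DDL keyword is COMPACT, with the type given as a quoted string.

```sql
-- Non-transactional ORC table: merge small ORC files into larger ones.
ALTER TABLE tableA CONCATENATE;

-- Transactional (ACID) table: request a compaction.
-- 'minor' merges delta files; 'major' rewrites base and deltas into a new base.
ALTER TABLE tableB COMPACT 'minor';
ALTER TABLE tableB COMPACT 'major';

-- Compaction requests are queued and run in the background;
-- this lists their current state.
SHOW COMPACTIONS;
```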

My question is: what is the difference between these two commands? Is it possible to use the compaction command on the non-transactional table, or the concatenate command on the transactional one? And why are there two different commands if both of them merge small files?

Thank you!
