
Hive alter table concatenate behaves oddly

I have dozens of tables with daily partitions, some of which need concatenation after creation and some of which don't. I'm not sure what to expect when I run ALTER TABLE ... CONCATENATE on these partitions. Should it produce roughly (byte count / block size) files, each just under the block size? Should it produce (square root of line count) files of indeterminate size? Is there a way to tune it?

Specifically, I'm trying to reduce my small-file problem, but I don't want to run concatenate on partitions where it won't actually do anything.
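
For concreteness, here is a minimal sketch of the operation in question. The table name `events`, the partition column `dt`, and the date value are hypothetical placeholders, not the real schema. It assumes the table is stored as RCFile or ORC, the formats for which Hive supports CONCATENATE. Comparing the file count before and after is one way to tell whether the call changed anything; the `numFiles` and `totalSize` values only appear if statistics are populated for the partition.

```sql
-- Hypothetical table and partition names, for illustration only.

-- Inspect the partition first: the partition parameters in the output
-- include numFiles and totalSize when statistics are available.
DESCRIBE FORMATTED events PARTITION (dt='2016-01-01');

-- Merge the small files in that single partition.
ALTER TABLE events PARTITION (dt='2016-01-01') CONCATENATE;

-- Inspect again and compare numFiles with the earlier value.
DESCRIBE FORMATTED events PARTITION (dt='2016-01-01');
```

If `numFiles` is unchanged after the second check, the concatenate call was effectively a no-op for that partition.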

Thanks in advance.