Could you help me understand why my inserts into a transactional table (bucketed, stored as ORC) always involve one last reducer task?
Hive: 3.1.2, Tez: 0.9
I mean, the Hive query plan looks good: it creates an appropriate number of stages, mapper tasks, and reducer tasks for the volume of data, but it always ends with one last reducer that has only a single task.
I cannot understand why this is so, and it seems inefficient. I read that if there are multiple buckets, Hive is able to write to a transactional table simultaneously using multiple reducer tasks.
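To make the question concrete, here is a minimal sketch of the kind of setup described. The table names (`tx_events`, `staging_events`), the columns, and the bucket count are hypothetical placeholders, not the actual schema. The `EXPLAIN` output is where the final single-task reducer would show up as the last vertex of the Tez plan.

```sql
-- Hypothetical table and column names, for illustration only.
-- A bucketed, transactional (ACID) table stored as ORC.
CREATE TABLE tx_events (
  id      BIGINT,
  payload STRING
)
CLUSTERED BY (id) INTO 8 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');

-- Print the Tez plan for the insert, including every reducer vertex,
-- without actually running the job.
EXPLAIN
INSERT INTO TABLE tx_events
SELECT id, payload
FROM staging_events;

-- Print the current value of the column-statistics autogather setting.
SET hive.stats.column.autogather;
```

When reading the plan, it may help to check which operators the last reducer contains. One possibility, which is an assumption and not confirmed here, is that it only aggregates statistics and does not write the bucket files. In that case the bucket writes themselves could still be running in parallel in the earlier reducer vertex.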
Does anyone have any thoughts on this?
Could you provide some examples?