Member since
10-18-2019
08-21-2020
08:02 AM
My table is huge: it has billions of rows and approximately 10,000 partitions. My exact question is whether to force a single bucket in each partition, by creating the table with a partitioning clause and a CLUSTERED BY (col) INTO 1 BUCKETS clause, so that we always have only one file per partition. The other solution is to let the partitions fill up without a bucketing clause at creation, and then compact the table to avoid accumulating many files in each partition. I don't know which solution is best. I agree with you that load performance degrades with a single bucket, because the writes are no longer parallelized.
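To make the two options concrete, here is a minimal HiveQL sketch. The table and column names are hypothetical, and I am assuming a transactional ORC table, since compaction is involved; the real schema will differ.

```sql
-- Hypothetical names throughout; adjust to the real schema.

-- Option 1: partitioned table forced to a single bucket per partition.
CREATE TABLE events_single_bucket (
  id      BIGINT,
  payload STRING
)
PARTITIONED BY (load_date STRING)
CLUSTERED BY (id) INTO 1 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- Option 2: same table without a bucketing clause;
-- the file count is kept down later by compaction.
CREATE TABLE events_unbucketed (
  id      BIGINT,
  payload STRING
)
PARTITIONED BY (load_date STRING)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
```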
08-20-2020
05:31 AM
I would like to know whether there are any adverse side effects of using partitioned tables with a single-bucket clause. The goal is to generate only one file per partition, and so avoid having many small files there. I use this approach because major compaction does not work very well: it does not always leave only one file in the partition.
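For reference, this is roughly how a major compaction can be requested for one partition and then monitored. It is only a sketch: the table name and partition value are hypothetical, and it assumes a transactional table.

```sql
-- Hypothetical table name and partition value.
-- Queue a major compaction for a single partition.
ALTER TABLE events_unbucketed
  PARTITION (load_date = '2020-08-20')
  COMPACT 'major';

-- List queued, running, and finished compactions to check the outcome.
SHOW COMPACTIONS;
```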
Labels:
Apache Hive