How many salt buckets should I use for my Phoenix tables?

How many salt buckets should I use for my Phoenix tables?

How do upserts of new records impact the number of pre-split regions?

How do updates of existing records impact the number of pre-split regions?

Re: How many salt buckets should I use for my Phoenix tables?

Since the number of salt buckets can only be set at table creation time, this can be a little tricky. It takes some foresight about what you need from the table, i.e. whether it will be more read-heavy or write-heavy. A neutral stance is to set the number of salt buckets to the number of HBase RegionServers in your cluster. If you anticipate heavy write loads, consider increasing that to around {HBase RegionServer count * 1.20}, which adds 20% more buckets and distributes the load more widely. Setting the number of salt buckets too high, however, may reduce your flexibility when you perform range-based queries.
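
As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function names, the table name EVENTS and its columns are hypothetical, and rounding up to a whole bucket is my assumption. The cap of 256 reflects the range Phoenix accepts for SALT_BUCKETS. Treat the result as a starting point, not a tuned value.

```python
# Phoenix accepts SALT_BUCKETS values from 1 to 256.
PHOENIX_MAX_SALT_BUCKETS = 256


def suggest_salt_buckets(region_servers: int, write_heavy: bool = False) -> int:
    """Rule-of-thumb bucket count: one per RegionServer, +20% if write-heavy."""
    if region_servers < 1:
        raise ValueError("region_servers must be at least 1")
    if write_heavy:
        # Integer form of ceil(region_servers * 1.20); rounding up is an assumption.
        buckets = (region_servers * 120 + 99) // 100
    else:
        buckets = region_servers
    return min(buckets, PHOENIX_MAX_SALT_BUCKETS)


def create_table_ddl(table: str, buckets: int) -> str:
    """Build a Phoenix CREATE TABLE statement for a hypothetical two-column table."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table} "
        f"(ID VARCHAR PRIMARY KEY, VAL VARCHAR) "
        f"SALT_BUCKETS = {buckets}"
    )


if __name__ == "__main__":
    n = suggest_salt_buckets(region_servers=10, write_heavy=True)
    print(create_table_ddl("EVENTS", n))
```

Because the bucket count is fixed when the table is created, it is worth running the numbers for the cluster size you expect to have, not only the one you have today.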

Re: How many salt buckets should I use for my Phoenix tables?

@Jeremy Dyer -- updated the question with additional items - any comments on those?

Re: How many salt buckets should I use for my Phoenix tables?

I was recently in a discussion with @Rajeshbabu Chintaguntla about this. If your table is relatively small compared to the amount of block cache you have available (e.g. if you can cache your entire table), it makes sense to limit the number of salt buckets to the number of region servers you have (similar to Jeremy's recommendation). However, once you get to much larger tables, ones that will definitely not fit into cache and will require disk access, you're going to benefit from having more buckets, so that the load is spread across multiple regions per server. I believe the consensus we reached was that something along the lines of 64 to 128 salt buckets would be a good starting point for tens of region servers.

Obviously, this also depends a lot on the number of region servers you're using and on the other users of HBase. If you're the only one using a 50-node HBase cluster, the recommendations would be vastly different than if you were one of 10 users sharing a 25-node HBase cluster. "It depends" :)

Re: How many salt buckets should I use for my Phoenix tables?

Consider the following points when deciding on the number of salt buckets:

  1. The number of region servers available
  2. The expected write throughput
  3. The HBase row key itself (if it is random enough not to cause hotspots, then I would suggest pre-splitting without salting to get better scans)
  4. Increasing the salt buckets to a high number may result in slower scans (depending on table size and scan)
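
To make points 3 and 4 more concrete, here is a small Python sketch of the idea behind salting. It is only a model: CRC32 is a stand-in hash chosen for illustration (Phoenix uses its own hash internally), and the function names are hypothetical. The salt is a single leading byte derived from the row key modulo the bucket count, so a range scan over the original keys has to be issued once per bucket and the results merged.

```python
import zlib


def salt_byte(row_key: bytes, buckets: int) -> int:
    """Pick a bucket for a row key. CRC32 is a stand-in, not Phoenix's real hash."""
    return zlib.crc32(row_key) % buckets


def salted_key(row_key: bytes, buckets: int) -> bytes:
    """Prefix the row key with its one-byte salt."""
    return bytes([salt_byte(row_key, buckets)]) + row_key


def range_scan_plan(start: bytes, stop: bytes, buckets: int) -> list:
    """One (start, stop) sub-scan per bucket, since matching rows may sit in any bucket."""
    return [(bytes([b]) + start, bytes([b]) + stop) for b in range(buckets)]


if __name__ == "__main__":
    buckets = 8
    # Sequential keys, which would otherwise all land in one region.
    for i in range(5):
        key = f"row{i:04d}".encode()
        print(key, "-> bucket", salt_byte(key, buckets))
    # The same logical range scan fans out into one sub-scan per bucket.
    print(len(range_scan_plan(b"row0000", b"row0100", buckets)), "sub-scans")
```

In this model, doubling the bucket count doubles the number of sub-scans for every range query, which is the cost point 4 refers to. If the row key is already well distributed, pre-splitting alone avoids that fan-out.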