Concurrent Users in HBase
- Labels: Apache HBase
Created on 07-26-2017 02:12 AM - edited 09-16-2022 04:59 AM
There are a lot of docs on horizontal scalability, but nowhere do I see how to handle concurrent users, e.g. 50K user requests at a time. My requirement: I have millions of records stored in HBase for multiple users. When many users (around 50K) try to access their data at the same time, will HBase be able to handle that many concurrent users? Is there any way to scale up connections? (There are a few parameters, but I believe none go up to 50K.) Please suggest a solution.
Created 07-26-2017 10:20 AM
Have a look at the two parameters below:
hbase.client.max.perregion.tasks
- The maximum number of concurrent mutation tasks the client will maintain to a single region. That is, if hbase.client.max.perregion.tasks writes are already in progress for this region, new puts won't be sent to it until some of those writes finish.
hbase.zookeeper.property.maxClientCnxns
- The number of concurrent connections that a single client can make to a single member of the ZooKeeper ensemble. This value should match the one in zoo.cfg, and it needs to be adjusted to the expected number of HBase client connections.
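To make the first parameter's effect concrete, here is a toy model of the back-pressure described above, written with only the Java standard library. It is not HBase client code: `PerRegionLimitModel` and its methods are hypothetical names, and a plain `Semaphore` stands in for the per-region limit. The point it illustrates is that the setting caps in-flight writes per region instead of raising the number of users that can be served.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only (assumption): mimics a per-region limit on in-flight writes.
// None of these names exist in the HBase client library.
class PerRegionLimitModel {
    private final Semaphore permits;                       // one permit per allowed in-flight write
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    PerRegionLimitModel(int maxPerRegionTasks) {
        this.permits = new Semaphore(maxPerRegionTasks);
    }

    // Simulates one write: waits for a permit, "works" briefly, then releases it.
    void write() throws InterruptedException {
        permits.acquire();
        try {
            int now = inFlight.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);         // remember the highest concurrency seen
            Thread.sleep(1);                               // stand-in for the RPC round trip
        } finally {
            inFlight.decrementAndGet();
            permits.release();
        }
    }

    // Runs `writers` callers against the modelled region and returns the
    // highest number of writes that were in flight at the same time.
    static int peakInFlight(int maxPerRegionTasks, int writers) {
        PerRegionLimitModel region = new PerRegionLimitModel(maxPerRegionTasks);
        ExecutorService pool = Executors.newFixedThreadPool(16);
        for (int i = 0; i < writers; i++) {
            pool.submit(() -> {
                try {
                    region.write();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return region.peak.get();
    }

    public static void main(String[] args) {
        // 200 callers, but never more than 4 writes in flight for this "region".
        System.out.println("peak in-flight writes: " + peakInFlight(4, 200));
    }
}
```

However many callers are submitted, the observed peak never exceeds the configured limit; the extra callers simply wait their turn.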
Created 08-02-2017 05:37 AM
It depends on what the 50K users are doing. (If your cluster capacity and configuration are right, you can scale horizontally without any problem.)
- If it is just point lookups (key-value access), then depending on the disks you are using (SSD/HDD), you should be able to scale without any problem. Some basic configuration tweaks are required, such as increasing the number of handlers for the DataNode and RegionServer and tuning the block cache/bucket cache.
- If you are doing heavy scans, you may need a large cluster that can bear the load. Network, CPU and disk will play an important role.
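For the point-lookup case, one rough way to reason about handler counts is Little's law (requests in flight = arrival rate × mean latency). The sketch below is only a back-of-the-envelope estimate: the request rate, latency and RegionServer count are made-up inputs, and `HandlerSizing` is a hypothetical helper, not part of HBase. Treat the result as a starting point for a setting such as `hbase.regionserver.handler.count`, to be validated by load testing.

```java
// Back-of-the-envelope sizing sketch (assumption: evenly spread load,
// steady arrival rate). Not an HBase API.
class HandlerSizing {

    // Little's law: mean requests in flight = arrival rate * mean latency.
    static double inFlight(double requestsPerSecond, double meanLatencyMillis) {
        return requestsPerSecond * meanLatencyMillis / 1000.0;
    }

    // Spreads the in-flight requests evenly over the RegionServers and
    // rounds up, giving a lower bound on busy handlers per server.
    static int handlersPerServer(double inFlightRequests, int regionServers) {
        return (int) Math.ceil(inFlightRequests / regionServers);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 50K users, each issuing one lookup per second,
        // with a mean server-side latency of 5 ms, on a 5-node cluster.
        double requestsPerSecond = 50_000.0;
        double meanLatencyMillis = 5.0;
        int regionServers = 5;

        double concurrent = inFlight(requestsPerSecond, meanLatencyMillis);
        System.out.println("requests in flight: " + concurrent);
        System.out.println("busy handlers per server: "
                + handlersPerServer(concurrent, regionServers));
    }
}
```

Under these assumed numbers, 50K active users translate into far fewer than 50K simultaneous requests, which is why per-request latency and cluster capacity matter more than a raw connection limit. Heavy scans raise the latency term sharply, so the same arithmetic gives a much larger figure for them.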
