I am new to HBase DB and currently evaluating it for one of the requirement we have from Customer.
We are going to write TBs of data in HBase daily and we need to fetch specifc data based on filter.
I came to know that it is very important to design the row key in such a manner, so that it effectively uses it to fetch the data from the specific node instead of scanning thru all the records in the database, based on the type of row key we design.
The problem with our requirement is that, we don't have any specific field which can be used to define the rowkey. We have around 7-8 fields available on the frontend, which can be used to filter the records from HBase.
Can you please suggest, what should be the design of my row key, which will help in faster retrieval of the data from TBs of data?
Attaching here the sample screen I am referring in this post.