Choice for a Web Application Cache

Master Guru

Would you use HBase?

Or an in-memory grid like Apache Ignite or Apache Geode?

1 ACCEPTED SOLUTION

Guru

@Timothy Spann

An in-memory data grid is much more than just a cache. Some key capabilities are:

  • Very granular control over the data being stored
  • Technology-agnostic serialization that enables access to cached data from several different languages (Java, C#, C++, etc.)
  • Loading of data on cache miss from any backing store
  • Write-through/write-behind to any backing store
  • Ability to off-load processing of instruction sets on individual cached entries or in map/reduce-style batches
  • Eventing framework providing notification of changes to individual entries or job execution
  • Tiered caching (on-heap, off-heap, disk)
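
To make the "load on cache miss" and "write-through" items concrete, here is a minimal plain-Java sketch. It is an illustration only and not the API of Ignite, Geode, or HBase: the class `ReadWriteThroughCache`, its `loads` counter, and the `HashMap` standing in for the backing store are all hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch of read-through / write-through caching.
// Not thread-safe and not distributed; a real data grid handles both.
class ReadWriteThroughCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Function<K, V> loader;   // reads the backing store on a miss
    private final BiConsumer<K, V> writer; // writes to the backing store on put
    int loads = 0;                         // how often the backing store was read

    ReadWriteThroughCache(Function<K, V> loader, BiConsumer<K, V> writer) {
        this.loader = loader;
        this.writer = writer;
    }

    // Read-through: serve from memory, fall back to the backing store on a miss.
    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            if (value != null) {
                entries.put(key, value);
            }
        }
        return value;
    }

    // Write-through: update the backing store first, then the in-memory copy.
    void put(K key, V value) {
        writer.accept(key, value);
        entries.put(key, value);
    }
}

class ReadWriteThroughDemo {
    public static void main(String[] args) {
        // A plain map stands in for the durable store (HBase, an RDBMS, ...).
        Map<String, String> store = new HashMap<>();
        store.put("user:1", "alice");

        ReadWriteThroughCache<String, String> cache =
                new ReadWriteThroughCache<>(store::get, store::put);

        System.out.println(cache.get("user:1")); // miss: loaded from the store
        System.out.println(cache.get("user:1")); // hit: served from memory
        cache.put("user:2", "bob");              // written to store and cache
        System.out.println(store.get("user:2"));
    }
}
```

Write-behind would differ only in that `put` queues the store update and flushes it asynchronously instead of calling the writer inline.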

HBase is an excellent NoSQL columnar data store, but when it comes to dealing with data in memory, all it offers is an LRU caching and eviction scheme with very little control over what data gets cached and stays cached. In fact, the main control knob is how much memory is allocated for caching per region server. Because HBase stores data durably, it is often a great choice for OLTP-style access. In fact, in-memory data grids are rarely used without a backing store like HBase. However, for application acceleration, processing, and functionality offload, an in-memory data grid can provide capabilities that HBase alone cannot.
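
As a rough illustration of what "LRU caching with a single size knob" means, here is a small sketch built on the JDK's `LinkedHashMap`. It is not HBase's block cache implementation (which caches HFile blocks and is sized by memory, not entry count); the class name `LruCache` and the entry-count `capacity` are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: the only tunable is the capacity.
// The application cannot choose which entries stay; the least
// recently accessed entry is always the one evicted.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: iteration order follows access
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict once the cache exceeds capacity.
        return size() > capacity;
    }
}

class LruCacheDemo {
    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // exceeds capacity, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

A data grid, by contrast, typically lets the application decide per entry what is loaded, pinned, expired, or evicted.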

3 REPLIES

Master Collaborator

What is the access pattern for the web application?

HBase has a bucket cache (off-heap) along with the block cache (on-heap).

After tuning, HBase can deliver good caching performance. However, few people use it as an in-memory caching solution.

HDP currently doesn't support Apache Ignite or Apache Geode.

Rising Star

HBase is not a good alternative to memory-based distributed caches.
