Separation of hot and cold data - HBase

Explorer

I have a table named 'X' with a column family 'cf'. The table contains data from the past 5 years. Old data is requested only a few times, whereas recent data is accessed frequently. I want to apply different storage policies to the data based on its age. How can I configure this?

 

Also, is it possible to specify different compression algorithms for hot and cold data in a single column family? I am asking because the HBase documentation recommends different algorithms for hot and cold data.


2 REPLIES

Super Collaborator

Hello @sachin_saju 

 

Thanks for using Cloudera Community. You have 2 asks in the post:
1. How to configure different storage policies for cold and hot data.

2. How to apply different compression algorithms within one column family.

 

For Q2, I believe this isn't feasible: a compression algorithm can only be set at the column family (CF) level. Review [1] for the compression recommendations for hot and cold data.
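
As a rough illustration of what a CF-level setting looks like, here is a minimal HBase shell sketch using the table 'X' and column family 'cf' from the question. The SNAPPY codec is only an assumption for the example (the question names no algorithm) and must be available on the cluster:

```
# HBase shell: the codec applies to the whole column family, not to a subset of its rows
alter 'X', {NAME => 'cf', COMPRESSION => 'SNAPPY'}

# Existing HFiles pick up the new codec only when they are rewritten by a compaction
major_compact 'X'

# Check the resulting column family attributes
describe 'X'
```

Since the attribute is attached to the column family, hot and cold rows stored in 'cf' end up with the same codec.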

For Q1, I assume you are referring to HDFS storage policies. If so, the policy is applied uniformly, i.e. I am not sure we can apply different HDFS storage policies to different data within the same CF. In HBase, we generally recommend SSD [2] for the WAL; otherwise, HBase data relies on whichever HDFS storage policy is in use. Alternatively, use Backup-Restore [3] to keep a "cold" copy of the data, which can be restored as required.

 

Regards, Smarak

 

[1] https://hbase.apache.org/book.html#data.block.encoding.types

[2] https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/configuring-hbase/topics/hbase-configure-stor...
[3] https://hbase.apache.org/book.html#br.overview

 

 

Explorer

Hello @smdas 
Thanks for the response.

 

These links mention the date tiered compaction policy in HBase. Does it somehow help in configuring different policies for the same column family, or did I misunderstand?