Separation of hot and cold data - HBase

New Contributor

I have a table named 'X' with a column family 'cf'. The table contains data from the past 5 years. Old data is requested only occasionally, whereas recent data is accessed frequently. I want to apply different storage policies to the data based on its age. How can I configure this?


Also, is it possible to specify different compression algorithms for hot and cold data in a single column family? I am asking because the HBase documentation recommends different algorithms for hot and cold data.


Super Collaborator

Hello @sachin_saju 


Thanks for using Cloudera Community. You have 2 asks in the post:
1. How to configure different storage policies for cold and hot data.
2. How to apply different compression algorithms within 1 column family.


For Q2, I believe this isn't feasible: the compression algorithm is set at the CF level, so it applies to all data in that CF. Review [1] for the compression algorithm recommendations for hot and cold data.
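
To make the CF-level point concrete, here is a minimal sketch that prints the HBase shell `alter` statement for changing compression on a whole column family. The table 'X' and family 'cf' are taken from the question; the helper name `alter_compression` and the codec list are assumptions for illustration, and which codecs actually work depends on what is installed on your cluster.

```python
# Sketch only: renders an HBase shell statement as text, it does not talk to HBase.
# VALID_CODECS is an assumed list; check which codecs your cluster supports.
VALID_CODECS = {"NONE", "GZ", "SNAPPY", "LZO", "LZ4", "ZSTD"}


def alter_compression(table: str, family: str, codec: str) -> str:
    """Return an HBase shell `alter` statement setting CF-wide compression."""
    codec = codec.upper()
    if codec not in VALID_CODECS:
        raise ValueError(f"unsupported codec: {codec}")
    # Compression is an attribute of the column family, so one codec
    # covers every row in the family regardless of how old the data is.
    return f"alter '{table}', {{NAME => '{family}', COMPRESSION => '{codec}'}}"


if __name__ == "__main__":
    print(alter_compression("X", "cf", "snappy"))
```

Paste the printed statement into `hbase shell`. Existing HFiles are typically rewritten with the new codec only when they are next compacted.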

For Q1, I assume you are referring to the HDFS storage policy. If yes, it is configured uniformly, i.e. I am not sure we can apply different HDFS storage policies to different data within the same CF. In HBase, we generally recommend SSD [2] for the WAL; otherwise, the HBase data relies on whichever HDFS storage policy is in use. Alternatively, use Backup-Restore [3] to keep a "cold" version of the data, which can be restored as required.


Regards, Smarak

New Contributor

Hello @smdas 
Thanks for the response.


These links mention the date tiered compaction policy in HBase. Does it somehow help in configuring different policies within the same column family, or did I misunderstand?

