Created 12-26-2022 10:01 PM
I have a table named 'X' with a column family 'cf'. The table contains data from the past 5 years. Old data is requested only a few times, whereas recent data is accessed frequently. I want to apply different storage policies to the data based on its age. How can I configure this?
Also, is it possible to specify different compression algorithms for hot and cold data in a single column family? I am asking because the HBase documentation recommends different algorithms for hot and cold data.
Created 12-27-2022 10:30 PM
Hello @sachin_saju
Thanks for using Cloudera Community. You have 2 asks in the post:
1. How to configure different Storage Policies for Cold & Hot Data,
2. Applying different Compression Algorithms within 1 Column Family.
For Q2, I believe this isn't feasible: the Compression Algorithm can only be set at the CF level, so all data in a CF shares the same algorithm. Review [1] for the Compression Algorithm recommendations for Hot & Cold data.
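As a rough sketch of what a CF-level setting looks like in the HBase shell, using the table 'X' and family 'cf' from the question. The choice of SNAPPY and FAST_DIFF here is only an illustrative assumption (Snappy must be available on the cluster), not a recommendation for this workload:

```ruby
# Run inside `hbase shell`. Applies to the whole column family 'cf';
# there is no per-row or per-age compression setting.
alter 'X', {NAME => 'cf', COMPRESSION => 'SNAPPY', DATA_BLOCK_ENCODING => 'FAST_DIFF'}

# Confirm the new attributes on the column family.
describe 'X'

# Existing HFiles keep their old compression until they are rewritten;
# a major compaction rewrites them with the new settings.
major_compact 'X'
```

If hot and cold data really need different algorithms, they would have to live in separate column families or tables, each with its own setting.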
For Q1, I assume you are referring to HDFS Storage Policy. If yes, it is configured uniformly, i.e. I am not sure we can apply different HDFS Storage Policies to different data within the same CF. In HBase, we generally recommend SSD [2] for the WAL; otherwise, HBase data relies on whichever HDFS Storage Policy is in use. Alternatively, use Backup-Restore [3] to keep a "Cold" version of the data, which can be restored as required.
Regards, Smarak
[1] https://hbase.apache.org/book.html#data.block.encoding.types
[2] https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/configuring-hbase/topics/hbase-configure-stor...
[3] https://hbase.apache.org/book.html#br.overview
Created on 12-27-2022 11:27 PM - edited 12-27-2022 11:28 PM
Hello @smdas
Thanks for the response.
These links mention the date tiered compaction policy in HBase. Does it somehow help in configuring different policies within the same column family, or did I misunderstand?