HBase (HFILE) with Hadoop EC

Hello, 

I've heard that HBase is not officially compatible with HDFS-EC (erasure coding policies) because of performance limitations and HDFS operations that EC does not support (e.g., hflush and hsync).

 

Can any errors or problems arise if I use HBase for read-only workloads?

 

For example, I generate HFiles directly from raw data (using Spark) and bulk load them into HBase, and HBase clients only issue read (get) requests. (There is also no HBase compaction.)

 

In the above scenario, I think HBase can work with HDFS-EC for the following reasons.

1. Since the data is written only through bulk load, the unsupported operations above (hflush, hsync) should not matter.

2. In my experience, although HDFS-EC's write performance is about 3 times slower than HDFS-REP (3 replicas), its read performance is almost the same as HDFS-REP. Therefore, the performance limitation does not seem to matter for a read-only workload.
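
For context on what is being traded against the slower writes, here is a minimal sketch of the storage arithmetic. It assumes the RS-6-3 policy (6 data units, 3 parity units) and a hypothetical 100 TiB of HFiles; neither figure comes from my actual cluster, and the function names are only illustrative.

```python
def ec_overhead(data_units: int, parity_units: int) -> float:
    """Raw bytes stored per logical byte for a Reed-Solomon striped layout."""
    return (data_units + parity_units) / data_units


def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per logical byte for plain replication."""
    return float(replicas)


def tolerated_failures_ec(parity_units: int) -> int:
    """A stripe survives the loss of up to `parity_units` of its units."""
    return parity_units


def tolerated_failures_rep(replicas: int) -> int:
    """A replicated block survives the loss of all but one replica."""
    return replicas - 1


if __name__ == "__main__":
    logical_tib = 100.0  # hypothetical size of the bulk-loaded HFiles

    # Assumed policy: RS-6-3 (6 data units + 3 parity units per stripe).
    ec_raw = logical_tib * ec_overhead(6, 3)
    rep_raw = logical_tib * replication_overhead(3)

    print(f"RS-6-3 raw storage:         {ec_raw:.1f} TiB")
    print(f"3x replication raw storage: {rep_raw:.1f} TiB")
    print(f"Failures tolerated (EC):    {tolerated_failures_ec(3)}")
    print(f"Failures tolerated (REP):   {tolerated_failures_rep(3)}")
```

Under these assumptions EC stores half the raw bytes of 3x replication, which is why I would like to use it if the read path is safe.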

 

I would appreciate advice on whether this HBase + HDFS-EC architecture is feasible in this scenario.

 
