Member since 11-23-2022 | 3 Posts | 0 Kudos Received | 0 Solutions
08-23-2023
10:57 PM
Is anyone aware of any plans to implement hflush for EC in future versions?
07-20-2023
10:58 PM
Hello,

I've heard that HBase is not officially compatible with HDFS-EC (erasure coding policy) because of performance limitations and HDFS operations that are not supported (e.g., hflush or hsync). Can any errors or problems arise if I use HBase for read-only workloads?

For example, I generate HFiles directly from raw data (using Spark) and bulk load them into HBase, and HBase clients issue only read (get) requests. There is also no HBase compaction.

In this scenario, I think HBase can work with HDFS-EC for the following reasons:

1. Since the data is written in bulk, the unsupported operations above do not matter.
2. In my experience, although HDFS-EC's write performance is about 3 times slower than HDFS-REP (3 copies), its read performance is almost the same as HDFS-REP's. So the performance limitation does not seem to matter.

I would like some advice on the feasibility of this HBase + HDFS-EC architecture in this scenario.
02-27-2023
10:54 PM
1 Kudo
Hello @bgkim,

Thanks for using Cloudera Community. To answer your question: the composite primary key requires both A and B in the WHERE clause, because the columns are indexed together. Your SELECT query would therefore likely benefit from a local index on A and C. You may review [1], since read-heavy use cases benefit from a global index, at the cost of a penalty on writes. Phoenix also offers covered indexes, and the explain plan helps confirm index usage. Link [2] offers a few examples as well. As with any recommendation, the best advice is to test the performance internally before implementing it in production.

Regards,
Smarak

[1] https://phoenix.apache.org/secondary_indexing.html
[2] https://learn.microsoft.com/en-us/azure/hdinsight/hbase/apache-hbase-phoenix-performance