hbase case insensitive query

Is there any way for HBase to match cell values case-insensitively?

If a cell has the value 'XYZ' and I query it with 'xyZ', is there a way to get the result back as a match?

1 ACCEPTED SOLUTION

Master Mentor

@ARUNKUMAR RAMASAMY HBase has a concept of filters. For your gets and scans, you can set up a filter for your specific need. Consider using the following filter and comparator classes for case-insensitive queries. http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SubstringComparator.html

The filter class to pair it with is SingleColumnValueFilter: https://hbase.apache.org/book.html#client.filter.cv.scvf

You can also try RegexStringComparator

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RegexStringComparator.html
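
For the exact 'XYZ' vs 'xyZ' case in the question, here is a minimal sketch of how the regular expression could be built. It uses only java.util.regex, so it runs without a cluster. The helper name exactIgnoreCase is hypothetical and not part of HBase, and I am assuming the cell holds a plain ASCII string. As far as I know, RegexStringComparator uses the java.util.regex engine by default, so the same expression should behave the same way when passed to new RegexStringComparator(expr).

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of HBase): builds a regex that matches
// a whole value while ignoring ASCII case.
final class CaseInsensitiveRegex {

    static String exactIgnoreCase(String literal) {
        // (?i)  turns on case-insensitive matching (ASCII letters only)
        // \A \z anchor the match to the start and end of the value
        // Pattern.quote escapes any regex metacharacters in the literal
        return "(?i)\\A" + Pattern.quote(literal) + "\\z";
    }

    public static void main(String[] args) {
        String expr = exactIgnoreCase("xyZ");
        Pattern p = Pattern.compile(expr);

        // find() is used here because the expression is already anchored
        System.out.println(p.matcher("XYZ").find());    // true
        System.out.println(p.matcher("XYZ123").find()); // false, extra characters
    }
}
```

Without the anchors the expression would also match longer values that merely contain the text, which is closer to what SubstringComparator does.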

A SingleColumnValueFilter (see: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html) can be used to test column values for equivalence (CompareOp.EQUAL), inequality (CompareOp.NOT_EQUAL), or ranges (e.g., CompareOp.GREATER). The following is an example of testing equivalence of a column to a String value "my value"…

SingleColumnValueFilter filter = new SingleColumnValueFilter(
  cf,
  column,
  CompareOp.EQUAL,
  Bytes.toBytes("my value")
  );
scan.setFilter(filter);
// example with case-insensitive substring comparator
// (matches any value that contains "substr", ignoring case)
SingleColumnValueFilter scvf =
   new SingleColumnValueFilter(cf, column, CompareOp.EQUAL,
     new SubstringComparator("substr"));
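
One thing to keep in mind with the substring approach: it tests containment, not equality. The sketch below is a plain-Java stand-in for that comparison, under my assumption that SubstringComparator lower-cases both sides and then checks whether the cell value contains the substring. The class and method names are made up for illustration and nothing here calls HBase.

```java
import java.util.Locale;

// Stand-in for the comparison semantics (assumed, not HBase code):
// lower-case both sides, then test whether the value contains the substring.
final class SubstringMatchSketch {

    static boolean containsIgnoreCase(String cellValue, String substr) {
        return cellValue.toLowerCase(Locale.ROOT)
                        .contains(substr.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("XYZ", "xyZ"));       // true
        System.out.println(containsIgnoreCase("abcXYZdef", "xyZ")); // true, longer value also matches
        System.out.println(containsIgnoreCase("XY", "xyZ"));        // false
    }
}
```

So a query for 'xyZ' would also return a cell holding 'abcXYZdef'. If only whole-value matches are wanted, an anchored regular expression with RegexStringComparator is probably the better fit.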

Cloudera has good documentation for HBase filtering in Java and the CLI:

http://www.cloudera.com/documentation/enterprise/5-2-x/topics/admin_hbase_filtering.html

2 REPLIES

@arunkumar

There is no direct way to compare values case-insensitively in HBase. You need to write a custom filter and add the jar to all region servers and the client, or else write a custom coprocessor that checks the value and does not skip the result when the upper-cased value matches.

If you use Phoenix, you can run a query with a WHERE condition such as UPPER(column_name) = 'XYZ'. It's that simple. Phoenix does a lot of things for us.