hbase case insensitive query

Is there any way for HBase to match cell values case-insensitively?

If a cell has the value 'XYZ' and I query it with 'xyZ', is there a way to get the result back as a match?

1 ACCEPTED SOLUTION

Re: hbase case insensitive query

Mentor

@ARUNKUMAR RAMASAMY HBase has a concept of filters. For your gets and scans, you can set up a filter for your specific need. For case-insensitive queries, consider the following comparator class, which matches substrings regardless of case: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SubstringComparator.html

The filter class to pair it with is SingleColumnValueFilter: https://hbase.apache.org/book.html#client.filter.cv.scvf

You can also try RegexStringComparator, which can be given java.util.regex flags such as Pattern.CASE_INSENSITIVE:

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RegexStringComparator.html

A SingleColumnValueFilter (see: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html) can be used to test column values for equivalence (CompareOp.EQUAL), inequality (CompareOp.NOT_EQUAL), or ranges (e.g., CompareOp.GREATER). The following is an example of testing equivalence of a column to a String value "my value"…

// Exact, case-sensitive match on the raw bytes of "my value"
SingleColumnValueFilter filter = new SingleColumnValueFilter(
  cf,
  column,
  CompareOp.EQUAL,
  Bytes.toBytes("my value")
  );
scan.setFilter(filter);
// Example with the case-insensitive substring comparator;
// cf and column are the byte[] family and qualifier, as above
SingleColumnValueFilter scvf =
   new SingleColumnValueFilter(cf, column, CompareOp.EQUAL,
     new SubstringComparator("substr"));
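One thing to watch with the two comparators: a substring match is looser than an equality check, so 'xyZ' would also match a cell holding 'AXYZB'. The sketch below uses only the JDK (no HBase on the classpath) to show the difference. It assumes SubstringComparator behaves like "lower-case both sides, then contains" and that RegexStringComparator does an unanchored regex find, which is my reading of the HBase source rather than something stated in this thread. The class and method names are made up for the illustration.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class CaseInsensitiveMatch {

    // Substring-style check (assumed to mirror SubstringComparator):
    // lower-case both sides, then test for containment.
    static boolean containsIgnoreCase(String cellValue, String query) {
        return cellValue.toLowerCase(Locale.ROOT)
                .contains(query.toLowerCase(Locale.ROOT));
    }

    // Builds an anchored, literal pattern so that only the whole value matches.
    // Pattern.quote keeps regex metacharacters in the query from being interpreted.
    static String exactPattern(String query) {
        return "^" + Pattern.quote(query) + "$";
    }

    // Regex-style check (assumed to mirror RegexStringComparator with
    // Pattern.CASE_INSENSITIVE): an unanchored find() on the pattern.
    static boolean equalsIgnoreCaseRegex(String cellValue, String query) {
        return Pattern.compile(exactPattern(query), Pattern.CASE_INSENSITIVE)
                .matcher(cellValue)
                .find();
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("XYZ", "xyZ"));      // true
        System.out.println(containsIgnoreCase("AXYZB", "xyZ"));    // true (substring hit)
        System.out.println(equalsIgnoreCaseRegex("XYZ", "xyZ"));   // true
        System.out.println(equalsIgnoreCaseRegex("AXYZB", "xyZ")); // false (anchored)
    }
}
```

If that assumption holds, passing the string from exactPattern("xyZ") together with Pattern.CASE_INSENSITIVE to RegexStringComparator should give a whole-value, case-insensitive match, whereas SubstringComparator gives a contains-style match.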

Cloudera has good documentation on HBase filtering in Java and from the CLI:

http://www.cloudera.com/documentation/enterprise/5-2-x/topics/admin_hbase_filtering.html

2 REPLIES

Re: hbase case insensitive query

@arunkumar

There is no direct way to compare values case-insensitively in HBase. You either need to write a custom filter and add the jar to all region servers and the client, or write a custom coprocessor that checks the value and keeps the result when the upper-cased value matches.

If you use Phoenix, you can run a query with a WHERE condition such as UPPER(column_name) = 'XYZ'. It's simple; Phoenix does a lot of this work for us.