Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

lower() required to find numeric value in Phoenix table

lower() required to find numeric value in Phoenix table

New Contributor

I have a table created in Phoenix with data all varchar and populated through Sqoop to underlying HBASE. The load is working, but upon query, I am required to lowercase all values in order to find them. I don't want to do this because I shouldn't have to do this and it kills performance.

Field THIS_ADDRESS a varchar column created through Phoenix with values loaded via a Sqoop to the underlying HBASE table.

/usr/bin/sqoop import -Dmapreduce.output.fileoutputformat.compress=false ...... --query 'select THIS_ADDRESS from THIS_ADDRESS_SOURCE where THIS_ADDRESS is not null and $CONDITIONS' --hbase-table MYTABLE --column-family MYTABLE --hbase-row-key THIS_ADDRESS -m 1

e.g.

0: jdbc:phoenix:ey9omprna016.vzbi.com:2181:/h> select THIS_ADDRESS from MYTABLE where THIS_ADDRESS='00000000139';

+------------------------------------------+------------------------------------------+ | THIS_ADDRESS |

+------------------------------------------+------------------------------------------+ +------------------------------------------+------------------------------------------+ No rows selected (0.032 seconds)

0: jdbc:phoenix:ey9omprna016.vzbi.com:2181:/h> select THIS_ADDRESS from MYTABLE where THIS_ADDRESS=lower('00000000139'); +------------------------------------------+------------------------------------------+ | THIS_ADDRESS | +------------------------------------------+------------------------------------------+ +------------------------------------------+------------------------------------------+

| 00000000139 | +------------------------------------------+------------------------------------------+

7 REPLIES 7

Re: lower() required to find numeric value in Phoenix table

The data is stored as a string and not a number. You even quoted the value in your select statement, not to mention that the LOWER function only accepts a string (https://phoenix.apache.org/language/functions.html#lower).

Use a numeric type for your table schema if you want to treat the data as numbers.

Re: lower() required to find numeric value in Phoenix table

New Contributor

Actually the column has some alphanumeric values as well, this example just happens to be numeric and therefore I need to use a string. I chose the all numeric value for the example because it is strange that I would need to lowercase a value which consists of all digits in order to find it. That doesn't make sense.

Re: lower() required to find numeric value in Phoenix table

Ah ok. Yes, I would agree that it is strange as `'00000000139' = lower('00000000139')` should be true. I would have thought that those values were equivalent. Maybe you could look at the raw data in the value of this table using the `hbase shell`?

Re: lower() required to find numeric value in Phoenix table

Also, can you try to do a little trick to see if the following works?

select 1 as A where '1234' = lower('1234');

This should print out the column A with a value of "1". If it doesn't (empty value), then it might just be a Phoenix bug. This worked for me on my local machine.

Re: lower() required to find numeric value in Phoenix table

New Contributor

Your test provides the expected result

0: jdbc:phoenix:ey9omprna016.vzbi.com:2181:/h> select 1 as A from MYTABLE where '1234' = lower('1234') limit 10; +------------+ | A | +------------+ | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | +------------+ 10 rows selected (0.135 seconds)

hbase(main):008:0> scan 'MYTABLE', {'LIMIT'=>5}

ROW  COLUMN+CELL   
 0  column=MYTABLE:ACTIVATION_DATE, timestamp=1489396441946, value=2011-10-12 12:42:15.0  
00000000000  column=MYTABLE:ACTIVATION_DATE, timestamp=1489396300572, value=2011-10-20 13:40:35.0  

Can't find the string via this statement

hbase(main):013:1> scan 'MYTABLE', {'STARTROW'=>'00000000139', 'ENDROW'=>'00000000139', LIMIT=>5}

<nothing>

Is it a whitespace padding issue perhaps? I am still investigating.

Re: lower() required to find numeric value in Phoenix table

It's probably not going to be a one-to-one mapping. The actual HBase row key will depend on your table DDL (e.g. multiple columns in the primary key constraint, salt buckets, and more). It doesn't look like the LOWER function does any trimming, I'd just agree that it seems like some sort of unprintable/non-obvious character in the string.

Re: lower() required to find numeric value in Phoenix table

Cloudera Employee

Phoenix uses its own encoding for the ascii fields and hence there is difference when you load the data directly to the hbase table. In this case it is required to load data into phoenix table instead of directly to hbase.

Since loading data directly from sqoop -> phoenix is not yet supported feature ( SQOOP-2649 ), one of the options here would be:

oracle -> csv -> phoenix

Don't have an account?
Coming from Hortonworks? Activate your account here