New Contributor
Posts: 4
Registered: ‎08-10-2016

Escaping non-ascii bytes in HiveQL query

Hello all --

 

Our HBase row key contains a series of bytes (ascii and non-ascii).

In a hex representation, it may look something like:

\x00R'\x7F\xBF

 

In the HBase shell, I can easily scan for this by using:

scan 'table', {LIMIT=> 10, STARTROW=>"\x00R'\x7F\xBF"}

 

But when using Hive, I can't seem to find a way to do a simple SELECT on this row.

Neither of the following seems to work:

SELECT * FROM table WHERE key = "\x00R'\x7F\xBF"
SELECT * FROM table WHERE key = "\\x00R'\\x7F\\xBF"

 

Using octal escaped characters in the form of "\###" works, but only for ascii characters (up to \177).

Therefore, this is not a viable solution, as some bytes are non-ascii (higher than \177).

SELECT * FROM table WHERE key = "\000\017\020\130\171"   -- works
SELECT * FROM table WHERE key = "\000\130\077\256\120"   -- does not work
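
For anyone reproducing this, here is a small Python sketch that turns a raw row key into the "\###" octal form and lists which bytes are above \177, i.e. the ones that break the octal approach. The function names are made up for illustration, and the script only builds strings; it does not talk to Hive or HBase.

```python
def to_octal_escapes(key: bytes) -> str:
    """Render every byte as a three-digit octal escape, e.g. 0x52 -> \\122."""
    return "".join(f"\\{b:03o}" for b in key)


def non_ascii_offsets(key: bytes) -> list[int]:
    """Offsets of bytes above 0x7F (octal 177), the ones reported as failing."""
    return [i for i, b in enumerate(key) if b > 0x7F]


row_key = b"\x00R'\x7f\xbf"  # the example key from the post

print(to_octal_escapes(row_key))   # \000\122\047\177\277
print(non_ascii_offsets(row_key))  # [4]
```

For the example key, only the last byte (\277, i.e. \xBF) is outside the ascii range, which matches the behaviour described above: the literal is fine up to \177 and fails once a higher byte appears.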

 

Is there a way to do what I'm trying to accomplish?

 

 TL;DR

How do I format a Hive query like

SELECT * FROM table WHERE key = ???

where ??? is a series of non-ascii bytes?

 

Thanks!

New Contributor
Posts: 1
Registered: ‎12-05-2018

Re: Escaping non-ascii bytes in HiveQL query


This is probably late, but does the '\uXXXX' syntax work for you?

 

So:

SELECT * FROM table WHERE key = "\u0000R'\u007F\u00BF"

 

instead of:

SELECT * FROM table WHERE key = "\x00R'\x7F\xBF"
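
If it helps, here is a rough Python sketch for generating the \uXXXX form from the raw bytes. One caveat I have not verified against a real cluster: \u00BF names the Unicode code point U+00BF, which is two bytes (C2 BF) in UTF-8, so if Hive encodes the literal as UTF-8 it may not match a key holding the single byte \xBF. The helper name is hypothetical, and the hex string printed at the end is only useful under the assumption that the key column is mapped so that Hive's hex() sees the raw bytes (for example as BINARY).

```python
def to_unicode_escapes(key: bytes) -> str:
    """Keep printable ASCII as-is; write every other byte as \\u00XX.

    Double quotes and backslashes are escaped too, so the result can sit
    inside a double-quoted literal.
    """
    return "".join(
        chr(b) if 0x20 <= b < 0x7F and b not in b'"\\' else f"\\u{b:04X}"
        for b in key
    )


row_key = b"\x00R'\x7f\xbf"  # the example key from the original post

print(to_unicode_escapes(row_key))  # \u0000R'\u007F\u00BF

# Caveat: the code point U+00BF is two bytes in UTF-8, not the byte 0xBF.
print("\u00bf".encode("utf-8"))     # b'\xc2\xbf'

# Hex form of the key, for a comparison such as hex(key) = '...'
# (assumption: the column mapping lets hex() see the raw bytes).
print(row_key.hex().upper())        # 0052277FBF
```

If the \u form does not match for bytes above \x7F, comparing on the hex string is the next thing I would try, but treat that as a guess until it has been run against the actual table.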

 
