Reply
Highlighted
New Contributor
Posts: 3
Registered: ‎07-16-2018
Accepted Solution

Regex not giving same results as other regex engines

I'm executing the following query in Hive.  Many of my field values in my actual table look like:

 

ASDF (1)

abc 123 xyz (10)

 

etc. etc.  I just want to replace any digits in parentheses at the end of a string with "x".

 

select 'SP0724_FA (1)' as str, regexp_replace(trim('SP0724_FA (1)'), ' [(]\d+[)]$', 'x') as str_2

 

The online testing app at regex101.com matches this.  However, Hive returns the same value for str and str_2.  I would expect to see "SP0724_FAx" as the str_2 value.  Please note the space at the beginning of the regex pattern.

 

Any ideas?

 

Thank you!

New Contributor
Posts: 3
Registered: ‎07-16-2018

Re: Regex not giving same results as other regex engines

Turned out to be pretty simple.  Hadoop (or at least Hive) prefers two backslashes like \\d rather than \d.  I ended up using this, which also requires one to four digits.

 

 [(]\\d{1,4}[)]$

Announcements