Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Regex not giving same results as other regex engines

Solved Go to solution
Highlighted

Regex not giving same results as other regex engines

New Contributor

I'm executing the following query in Hive.  Many of my field values in my actual table look like:

 

ASDF (1)

abc 123 xyz (10)

 

etc. etc.  I just want to replace any digits in parentheses at the end of a string with "x".

 

select 'SP0724_FA (1)' as str, regexp_replace(trim('SP0724_FA (1)'), ' [(]\d+[)]$', 'x') as str_2

 

The online testing app at regex101.com matches this.  However, Hive returns the same value for str and str_2.  I would expect to see "SP0724_FAx" as the str_2 value.  Please note the space at the beginning of the regex pattern.

 

Any ideas?

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Regex not giving same results as other regex engines

New Contributor

Turned out to be pretty simple.  Hadoop (or at least Hive) prefers two backslashes like \\d rather than \d.  I ended up using this, which also requires one to four digits.

 

 [(]\\d{1,4}[)]$

1 REPLY 1

Re: Regex not giving same results as other regex engines

New Contributor

Turned out to be pretty simple.  Hadoop (or at least Hive) prefers two backslashes like \\d rather than \d.  I ended up using this, which also requires one to four digits.

 

 [(]\\d{1,4}[)]$