Why does Impala 2.0 regex .*? behavior differ from a typical implementation?

New Contributor

From the docs at https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/impala_string_functions.html#string_functions__regexp_extract

 

"This example shows how a pattern string starting with .*? matches the shortest possible portion of the source string, returning the rightmost set of lowercase letters."

    
select regexp_extract('AbcdBCdefGHI','.*?([[:lower:]]+)',1);

 

returns def


Every other regex implementation I've worked with would return bcd.

 

I can't make sense of the docs either. "Shortest possible portion" and "returning the rightmost set" contradict each other: when the search proceeds from the left, the shortest possible match for .*? leads to the leftmost run of lowercase letters, not the rightmost.
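For comparison, here is a minimal sketch of the same extraction in Python's re module, which uses a backtracking engine. It is not Impala and says nothing about Impala's internals; it only illustrates what a "typical implementation" returns. Two assumptions: [a-z] is substituted for [[:lower:]] because Python's re has no POSIX bracket classes, and the helper name first_group is purely illustrative.

```python
import re

SOURCE = "AbcdBCdefGHI"

# [a-z] stands in for the POSIX class [[:lower:]], which Python's re module
# does not support; this substitution is an assumption of the sketch.
LAZY = r".*?([a-z]+)"    # lazy prefix: consume as little as possible
GREEDY = r".*([a-z]+)"   # greedy prefix: consume as much as possible


def first_group(pattern: str, text: str) -> str:
    """Return capture group 1 of the first match, or '' if there is no match."""
    match = re.search(pattern, text)
    return match.group(1) if match else ""


lazy_result = first_group(LAZY, SOURCE)
greedy_result = first_group(GREEDY, SOURCE)

print(lazy_result)    # bcd
print(greedy_result)  # f
```

In this engine the lazy form yields the leftmost run of lowercase letters (bcd), and the greedy form yields only the final lowercase letter (f). Neither form returns def.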


5 REPLIES

Super Guru

@cjard,

 

I believe this is a bug. Check this out: https://issues.apache.org/jira/browse/IMPALA-2917

I agree that the documentation needs to be fixed as well.

 

André

 

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Community Manager

I've advised the documentation team of the confusing wording for you. Thank you for letting us know. 


Cy Jervis, Manager, Community Program
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

New Contributor

Thanks all!

 

I'm curious: does this mean the documentation will be heavily rewritten to explain how the .*? implementation is atypical, or will the bug be fixed so that the documentation only needs a small correction?

Community Manager

That will be up to the documentation team. I alerted them to your concerns so they can look into it. 


Cy Jervis, Manager, Community Program

New Contributor

Fingers crossed this pushes the reported bug up the review list a bit, then! 🙂 Thanks for the excellent help and attention.