Not able to understand the regexp_extract syntax


regexp_extract(col_value,'^(?:([^,]*),?){1}',1)

1 ACCEPTED SOLUTION


https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF

regexp_extract(string subject, string pattern, int index)

Returns the string extracted using the pattern. For example, regexp_extract('foothebar', 'foo(.*?)(bar)', 2) returns 'bar'. Note that some care is necessary when using predefined character classes: using '\s' as the second argument will match the letter s; '\\s' is necessary to match whitespace, etc. The 'index' parameter is the Java regex Matcher group() method index. See docs/api/java/util/regex/Matcher.html for more information on the 'index' or Java regex group() method.

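In your case it will return everything from the start of the string up to, but not including, the first comma. Index 1 selects the inner capturing group ([^,]*); the comma is matched only by the surrounding non-capturing group (?:...). For example, if your text is "abc,def,geh", it will return "abc". With index 0 (the whole match) it would return "abc,".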
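To check which part each index selects, here is a minimal sketch that imitates regexp_extract with Python's re module. It is only an approximation: Hive uses Java regular expressions, and the helper regexp_extract below is a hypothetical stand-in written for this illustration, not Hive code. Returning an empty string when nothing matches is a simplification chosen here. For these particular patterns, Python and Java regexes behave the same way.

```python
import re

def regexp_extract(subject: str, pattern: str, index: int) -> str:
    """Hypothetical stand-in for Hive's regexp_extract, for illustration only."""
    match = re.search(pattern, subject)
    if match is None:
        return ""  # simplification chosen for this sketch
    # group(0) is the whole match; group(n) is the n-th capturing group.
    return match.group(index) or ""

# The pattern from the question, with index 0 and index 1.
whole_match = regexp_extract("abc,def,geh", r"^(?:([^,]*),?){1}", 0)
first_field = regexp_extract("abc,def,geh", r"^(?:([^,]*),?){1}", 1)

# The example quoted from the Hive manual.
doc_example = regexp_extract("foothebar", r"foo(.*?)(bar)", 2)

print(whole_match)  # abc,
print(first_field)  # abc
print(doc_example)  # bar
```

Index 0 includes the comma because it covers the whole match, while index 1 covers only what the inner parentheses matched.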

Hope this helps.


4 REPLIES


Hi Pierre,

Thanks for looking into my query. Yes, it is very clear to me now, except for one doubt.

I am not clear about ?: in my query and (.*?) in your example.

Sorry for asking about very basic things, but if you could give me a brief explanation, it would help me write other functions.

Regards

Sachin Mittal


I'd recommend having a look at this site: http://regexr.com/

You can enter your regular expression and then click on "Explain" (at the bottom) to have a complete explanation about the regular expression you entered. It also gives you the possibility to test your regular expression with any text you want.
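As a rough illustration of the two constructs asked about, the sketch below compares (?:...) with plain parentheses, and the lazy (.*?) with the greedy (.*). It uses Python's re module as a stand-in for the Java regex engine that Hive relies on. The input "foothebarbar" is a made-up string chosen so that lazy and greedy matching give different results.

```python
import re

# (?:...) groups without capturing, so the inner ([^,]*) is still group 1.
non_capturing = re.search(r"^(?:([^,]*),?)", "abc,def,geh")

# With plain parentheses the outer group becomes group 1 and the inner one group 2.
capturing = re.search(r"^(([^,]*),?)", "abc,def,geh")

# .*? is lazy: it takes as few characters as possible before "bar" can match.
lazy = re.search(r"foo(.*?)(bar)", "foothebarbar")

# .* is greedy: it takes as many characters as possible, then backtracks.
greedy = re.search(r"foo(.*)(bar)", "foothebarbar")

print(non_capturing.groups())  # ('abc',)
print(capturing.groups())      # ('abc,', 'abc')
print(lazy.group(1))           # the
print(greedy.group(1))         # thebar
```

So ?: only changes how the groups are numbered, and the ? after .* only changes how much text that group takes.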

Hope this helps.


Hi Pierre,

Very nice of you.

Thanks a lot. I visited the site and it cleared up most of my doubts.

Regards

Sachin Mittal