Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

alternative or substitute of REGEXP_INSTR function in Hadoop Hive

Highlighted

alternative or substitute of REGEXP_INSTR function in Hadoop Hive

New Contributor

Hi,

Please help me as I want to use an alternative or substitute of REGEXP_INSTR function in Hadoop Hive since HIVE does not support REGEXP_INSTR.

Suhag

1 REPLY 1

Re: alternative or substitute of REGEXP_INSTR function in Hadoop Hive

Super Guru

@Suhag Pandya

You can use instr function in hive to return the first occurance of the substring in your string.

select instr(string str, string substr);

Example:-

hive> select instr('foobar','o'); //returns first position of o in foobar string
+------+--+
| _c0  |
+------+--+
| 2    |
+------+--+

To return the first position of the substring from nth position in your string then use locate function in hive

Example:-

hive> select locate('o','fooobar',4);//returns the first occurance from 4th position 
+------+--+
| _c0  |
+------+--+
| 4    |
+------+--+

For case insensitive use lower function with locate function

Example:-

hive> select locate('o',lower('FOOOBAR'),4);
+------+--+
| _c0  |
+------+--+
| 4    |
+------+--+

There is no built in function that can gives your nth occurrence and match regex of the substr in hive yet, but you can create UDF for this functionality in hive.

-

If the Answer addressed your question, Click on Accept button below to accept the answer, That would be great help to Community users to find solution quickly for these kind of issues.

Don't have an account?
Coming from Hortonworks? Activate your account here