Support Questions

Find answers, ask questions, and share your expertise

Need to extract words from a column containing multiple words without the first word

avatar
Explorer

I am doing a couple of operations of which I am extracting the first word by using

split(item_desc, ' ')[0]

I also need to extract all words excluding the first word. Can anyone suggest a way please?

Regards,

Souveek

3 REPLIES 3

avatar
Explorer

avatar
Master Guru
@Souveek Ray

Use regexp_extract function as you are doing split on space, write an matching regex to extract until first space as first word and next capture group excludes the first word.

Example:

I'm having firstword secondword thirdword as a value

hive> select regexp_extract("firstword secondword thirdword","^(.*?)\\s(.*)",1)first_word,
regexp_extract("firstword secondword thirdword","^(.*?)\\s(.*)",2) not_first_word; +-------------+-----------------------+--+ | first_word | not_first_word | +-------------+-----------------------+--+ | firstword | secondword thirdword | +-------------+-----------------------+--+

So in the above example we have extracted the first word by using regexp_extract and 1 capture group and in the next we have used regexp_extract and 2 capture group(excluding firstword).

avatar
Explorer

Hi,

Thank you, I figured this out using substring

substr(item_desc,instr(item_desc,' ')+1,length(item_desc))

This is giving me the desired results

Regards

Souveek