Hive - regex_extract URL

Hi Guys,

 

I kindly request your assistance with the query below:

select
  ip,
  url,
  regexp_extract(url, "(/[a-zA-Z]+)/?") as page,
  regexp_extract(url, "(/[a-zA-Z]+)/?") as subpage
from tokenized_access_logs
limit 10;

Example URL: /department/apparel/category/featured%20shops/product/adidas%20Kids'%20RG%20III%20Mid%20Football%20Cleat

How can I extract the page and subpage from it? The expected result is:

page: department

subpage: apparel

Rgs,

Rodrigo
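
One possible approach, offered as a sketch and not as a verified answer: both expressions in the query above use the same pattern and the same (default) capture group, so both columns return the first match, "/department". Anchoring the pattern at the start of the string and capturing a different path segment in each call should give the two values separately. The inline subquery and its alias u are assumptions used only to make the example self-contained; in practice the url column of tokenized_access_logs would take their place.

```sql
-- Sketch only: the literal URL stands in for the url column of tokenized_access_logs.
-- The subquery alias "u" is hypothetical and exists just for this example.
SELECT
  -- first path segment: the text between the leading "/" and the next "/"
  regexp_extract(u.url, '^/([^/]+)', 1)       AS page,
  -- second path segment: skip the first segment, then capture the next one
  regexp_extract(u.url, '^/[^/]+/([^/]+)', 1) AS subpage
FROM (
  SELECT "/department/apparel/category/featured%20shops/product/adidas%20Kids'%20RG%20III%20Mid%20Football%20Cleat" AS url
) u;
```

For the example URL this should return page = department and subpage = apparel, without the leading slash, because the slash sits outside the capture group. Using [^/]+ instead of [a-zA-Z]+ also lets segments that contain digits or percent-encoded characters (such as featured%20shops) match in full.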