Hive - regexp_extract URL

Hi guys,

I would appreciate your help with the query below:

select
ip, url,
regexp_extract(url, "(/[a-zA-Z]+)/?") as page,
regexp_extract(url, "(/[a-zA-Z]+)/?") as subpage
from tokenized_access_logs limit 10;

 

Example URL: /department/apparel/category/featured%20shops/product/adidas%20Kids'%20RG%20III%20Mid%20Football%20Cleat

How can I extract the page and subpage from it? The expected result is:

 

page: department

subpage: apparel

Regards,

Rodrigo