
Hive - regexp_extract URL


Hi Guys,


I would appreciate your help with the query below:


select ip, url,
regexp_extract(url, "(/[a-zA-Z]+)/?") as page,
regexp_extract(url, "(/[a-zA-Z]+)/?") as subpage
from tokenized_access_logs limit 10;


Example URL: /department/apparel/category/featured%20shops/product/adidas%20Kids'%20RG%20III%20Mid%20Football%20Cleat


How can I extract the page and subpage so that the result is:


page: department

subpage: apparel
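
A possible approach, as a sketch rather than a confirmed answer: both expressions in the query use the same pattern, so they return the same first segment, and the capture group includes the leading slash. Anchoring the pattern at the start of the URL and moving the slash outside the group should give the two values asked for. This assumes Hive's regexp_extract(subject, pattern, index) form and the tokenized_access_logs table with ip and url columns from the question.

```sql
-- Sketch only: assumes the table and columns named in the question.
select
  ip,
  url,
  -- group 1 = first path segment, without the leading slash
  regexp_extract(url, '^/([a-zA-Z]+)', 1) as page,
  -- skip the first segment, then capture the second segment as group 1
  regexp_extract(url, '^/[a-zA-Z]+/([a-zA-Z]+)', 1) as subpage
from tokenized_access_logs
limit 10;
```

For the example URL this should yield page = department and subpage = apparel. Note that [a-zA-Z]+ stops at the first non-letter, so segments containing digits or escapes such as %20 would be cut short; [^/]+ may be a better fit there. Another option that avoids regular expressions is split(url, '/')[1] for the page and split(url, '/')[2] for the subpage, since the leading slash produces an empty element at index 0.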