Hive - Word split and count

Hi all,

I am trying to implement a version of word count in Hive. I have the following fields: Owner_key and Post. I want to split each post into its individual words, then group by user (Owner_key) and word, giving a count of each word per user. For example, if this were my data:

Owner_key  Post
1          apple orange apple
2          melon kiwi

I would like the following output:

Owner_key  word    count
1          apple   2
1          orange  1
2          melon   1
2          kiwi    1

The code I have attempted is below. Hive does not give me an error message, but it never shows any results, even when the job status reaches 100%.

Can anyone help?

Thanks in advance.

SELECT owner_key, word, count(*)
FROM stackdata_updtd
LATERAL VIEW explode(split(lower(post), '\\W+')) t1 AS word
GROUP BY owner_key, word;

Re: Hive - Word split and count

Hi @Maeve Ryan, it worked like a charm for me. Are you using the Hive view?

[Screenshot attached: 3016-2016-03-27-15-04-52.png]
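
For anyone who wants to check the query independently of the stackdata_updtd table, here is a minimal sketch that runs the same split-and-count logic against the two sample rows from the question. The inline subquery and its alias `sample` are made up for illustration, and the `WHERE word != ''` filter is an addition, not part of the original query: it assumes you want to drop the empty tokens that `split` can produce when a post starts or ends with punctuation.

```sql
-- Hypothetical inline data standing in for stackdata_updtd (sample rows from the question)
SELECT owner_key, word, count(*) AS cnt
FROM (
  SELECT 1 AS owner_key, 'apple orange apple' AS post
  UNION ALL
  SELECT 2 AS owner_key, 'melon kiwi' AS post
) sample
-- One output row per word: lower-case the post, split on runs of non-word characters
LATERAL VIEW explode(split(lower(post), '\\W+')) t1 AS word
-- Added assumption: skip empty tokens caused by leading/trailing punctuation
WHERE word != ''
GROUP BY owner_key, word;
```

If this returns the four rows listed in the question (in no guaranteed order), the query logic itself is fine, and the missing results are more likely down to the client or view displaying them than to the SQL.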
