Support Questions
Find answers, ask questions, and share your expertise

How to do a word count with Apache Phoenix

How to do a word count with Apache Phoenix

New Contributor

Hi All, I'm trying to split words of a record in Apache Phoenix and count its occurrences. I do not see a split() function in Phoenix so I'm using REGEXP_SPLIT(line,' ') which returns an Array. How do I group words further and count how many times a word occurs?

select REGEXP_SPLIT(line,' ') from twitter_feed;
OutPut: 

	[@Tea,best,is]

	[@Jimmy,should,get,over,it]
1 REPLY 1
Highlighted

Re: How to do a word count with Apache Phoenix

Expert Contributor

Try:

SELECT ARRAY_LENGTH( REGEXP_SPLIT(line, ' ')) FROM twitter_feed;

This is a common SQL challenge. Count will be incorrect if you have more than one spaces between words.