Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Find a keyword in a column in Hive table

Solved Go to solution
Highlighted

Find a keyword in a column in Hive table

New Contributor

I have a Hive table with a number of columns where column X contains a large string of text with many spaces between each word, all delimiters have been removed and all that remains are 0-9 a-z and A-Z characters. I would like to query Column X for a keyword Y and Y^ (eg Java and Javascript) and count the number of unique users from column Z that have mentioned these two words. Either of these words need to be mentioned once in column X for it to be counted. I would then like to count the total number of users that mentioned either of these two words.

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Find a keyword in a column in Hive table

Hi @John Garrigan, your specs are a little imprecise, but can you try this:

select count(distinct Z) from t1 where locate(Y, X)>0;   -- for example:
select count(distinct Z) from t1 where locate("Java", X)>0;

Note that, as stated, your condition on "Y^" (usually expressed as "Y*") is superfluous, since if Y is there then any superstring of Y will be there too.

3 REPLIES 3

Re: Find a keyword in a column in Hive table

Hi @John Garrigan, your specs are a little imprecise, but can you try this:

select count(distinct Z) from t1 where locate(Y, X)>0;   -- for example:
select count(distinct Z) from t1 where locate("Java", X)>0;

Note that, as stated, your condition on "Y^" (usually expressed as "Y*") is superfluous, since if Y is there then any superstring of Y will be there too.

Re: Find a keyword in a column in Hive table

New Contributor

Hi Predrag, many thanks for your help this worked perfectly.

Re: Find a keyword in a column in Hive table

New Contributor

hi @Predrag Minovic

how can i locate some word but in a table, no in a column?

select count(distinct Z)from t1 where locate("Java", X)>0;

thank you.

Don't have an account?
Coming from Hortonworks? Activate your account here