Support Questions

Find answers, ask questions, and share your expertise

Find a keyword in a column in Hive table

avatar
Contributor

I have a Hive table with a number of columns where column X contains a large string of text with many spaces between each word, all delimiters have been removed and all that remains are 0-9 a-z and A-Z characters. I would like to query Column X for a keyword Y and Y^ (eg Java and Javascript) and count the number of unique users from column Z that have mentioned these two words. Either of these words need to be mentioned once in column X for it to be counted. I would then like to count the total number of users that mentioned either of these two words.

1 ACCEPTED SOLUTION

avatar
Master Guru

Hi @John Garrigan, your specs are a little imprecise, but can you try this:

select count(distinct Z) from t1 where locate(Y, X)>0;   -- for example:
select count(distinct Z) from t1 where locate("Java", X)>0;

Note that, as stated, your condition on "Y^" (usually expressed as "Y*") is superfluous, since if Y is there then any superstring of Y will be there too.

View solution in original post

4 REPLIES 4

avatar
Master Guru

Hi @John Garrigan, your specs are a little imprecise, but can you try this:

select count(distinct Z) from t1 where locate(Y, X)>0;   -- for example:
select count(distinct Z) from t1 where locate("Java", X)>0;

Note that, as stated, your condition on "Y^" (usually expressed as "Y*") is superfluous, since if Y is there then any superstring of Y will be there too.

avatar
Contributor

Hi Predrag, many thanks for your help this worked perfectly.

avatar
Explorer

hi @Predrag Minovic

how can i locate some word but in a table, no in a column?

select count(distinct Z)from t1 where locate("Java", X)>0;

thank you.

avatar
Contributor

hi,

 

Were you able to find a solution for this ?