Created 06-04-2018 02:04 PM
Hi,
I am trying to create a view in Hortonworks that I have in postgres greenplum where I am trying to find a word/words that exists in a column that has lot of words. If the word exists, the new column should show the word and if there are multiple words, the multiple words should be reflected with a space in between. Below is a running code in postgres Greenplum. Can anyone help me with the code in Hive?
CREATE OR REPLACE VIEW XYZ_V AS( Select col1, col2, col3, col4,trim(both ' ' from
case when substring( lower(col4) from 'abcd' ) is null then '' else substring( lower(col4) from 'abcd') end ||' '||
case when substring( lower(col4) from 'pqrs' ) is null then '' else substring( lower(col4) from 'pqrs') end ||' '||
case when substring( lower(col4) from 'mnop' ) is null then '' else substring( lower(col4) from 'mnop') end) as col5
FROM XYZ);
Thanks a lot in advance
Regards
Created 06-04-2018 03:49 PM
Hey @Souveek Ray!
You can change your CREATE OR REPLACE VIEW to CREATE VIEW, in hive they have the same effect (doesn't need the word REPLACE) 🙂
TRIM Function not needed to use 'both' for Hive, the function TRIM in hive applies in both sides automatically.
LOWER function just put the value from COL4
CASE WHEN are the same in this case.
And your substring function, I'm not sure about the logic in your query, but, the sintax in hive are SUBSTR(value, init_pos, len). E.g. SUBSTR('Vinicius', 1,3) = Vin.
Below some good docs about Hive SQL 🙂
https://hortonworks.com/blog/hive-cheat-sheet-for-sql-users/
https://cwiki.apache.org/confluence/display/Hive/LanguageManual
Hope this helps 🙂
Created 06-05-2018 01:06 PM
Thank you for replying but my main concern is to replicate the part using substring in Hive. I can't figure out a way to get the exact output
Created 06-05-2018 01:38 PM
Hmmm I see, so could you share some results from Postgresql? Maybe we can figure out another way do to this 🙂
Created 06-05-2018 01:51 PM
Let me give you an example of the code that is working in postgres based on the code I gave.
CREATE OR REPLACE VIEW XYZ_V AS( Select col1, col2, col3, col4,trim(both ' ' from
case when substring( lower(col4) from 'abcd' ) is null then '' else substring( lower(col4) from 'abcd') end ||' '||
case when substring( lower(col4) from 'pqrs' ) is null then '' else substring( lower(col4) from 'pqrs') end ||' '||
case when substring( lower(col4) from 'mnop' ) is null then '' else substring( lower(col4) from 'mnop') end) as col5
FROM XYZ);
Example:
COL1\t COL2\t COL3\t COL4\t COL5
1\t pdf\t page1\t efgh opsdw mnop\t mnop
2\t txt\t page4\t abcd qwerty pqrs\t abcd pqrs
3\t txt\t page3\t abcd pqrs xyzsd mnop\t abcd pqrs mnop
(\t denotes the break in columns)
Regards,
Souveek
Created 06-06-2018 05:22 PM
Hey @Souveek Ray
Maybe this code can help you (it's just a scratch)
select concat_ws(' ',case when regexp_extract('abcd pqrs xyzsd mnop', '(abcd)', 0) is null then '' else 'abcd' end, case when regexp_extract('abcd pqrs xyzsd mnop', '(mnop)', 0) is null then '' else 'mnop' end, case when regexp_extract('abcd pqrs xyzsd mnop', '(pqrs)', 0) is null then '' else 'pqrs' end)
Created 06-07-2018 07:48 AM
Thank you for taking out your time to help me. The regexp_extract didnt work correctly but I got the idea and modified the code to get the desired results.
Here is what I did:
trim(concat_ws(' ', case when instr( lower(col4) , 'abcd' ) >0 then 'abcd' else '' end,
case when instr( lower(col4) , 'pqrs' ) >0 then 'pqrs' else '' end))
This worked perfectly. Thank you so much
Regards
Souveek
Created 06-07-2018 03:00 PM
Awesome! 🙂 Good to know @Souveek Ray