Created 10-03-2024 11:34 AM
On a Hive 2 cluster, I do not have to escape a double quote inside a regex pattern when the pattern is delimited by single quotes.
On Hive 3, however, the double quote must be escaped for the same code to work (example below).
Why the difference?
-- Works in Hive 2 but not in Hive 3
select
    id
  , case
        when url rlike '(?i)^https://www.linkedin.com/("?in|pub|company|profile)' then url
        else regexp_replace(url, '^https://www.linkedin.com/', 'https://www.linkedin.com/in/')
    end as social_url
from db.table
limit 10
-- Works in Hive 3
select
    id
  , case
        when url rlike '(?i)^https://www.linkedin.com/(\"?in|pub|company|profile)' then url
        else regexp_replace(url, '^https://www.linkedin.com/', 'https://www.linkedin.com/in/')
    end as social_url
from db.table
limit 10
Created 10-03-2024 04:29 PM
@IanWilloughby Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our Hive experts @Shmoo @cravani who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres
Created 10-07-2024 01:02 AM
@IanWilloughby Could you provide some sample records for the url column so I can better understand the data? Could you also share the specific Hive 2 and Hive 3 versions you are using?
Created 10-10-2024 04:28 PM
@IanWilloughby If you are still experiencing the issue, can you provide the information @ggandharan has requested? Thanks.
Regards,
Diana Torres