Created on 05-02-2017 08:50 AM - edited 09-16-2022 04:32 AM
Hi,
We have an use case to refactor query from postgreSQL to Hive as we are moving to Hadoop.
HDP-2.5.0 installed and all the data have been moved to Hive. Next we are changing the queries. In the existing postgres function we have an update of data as below. How can I refactor that for Hive ? Looking for your guidance to accomplish this.
update pricing_reporting_final b set coverage_type=a.X_GEHC_COVERAGE_TYPE from ( select invc_num,X_GEHC_COVERAGE_TYPE from pricing_reporting_final where x_trx_cd='Invoice' group by 1,2) a where a.invc_num=b.Original_Invoice_Num and b.x_trx_cd in ('Rebill','Credit Memo');
Created 05-02-2017 06:01 PM
update is supported for ACID tables in hive, https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Update
Created 05-02-2017 07:51 PM
Update is supported only for hive table in which ACID property is enabled.
Also by seeing your querying im afraid as I hive will not support such use case as of now. However once MERGE statement is added you will be able to update using join.
Check on this Jira ticket .
https://issues.apache.org/jira/browse/HIVE-10924
If you use your existing query it will fail. Alternatively load the data after performing join into a temp table and then update the target based on your temp table.
Created 09-08-2024 10:36 PM
With the Hive (newer than Hive 2.2),
you can use Merge INTO
MERGE INTO target_table AS target
USING source_table AS source
ON target.id = source.id
WHEN MATCHED THEN
UPDATE SET
target.name = source.name,
target.age = source.age
WHEN NOT MATCHED THEN
INSERT (id, name, age)
VALUES (source.id, source.name, source.age);