Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

update data in Hive using join

avatar
Expert Contributor

Hi,

We have an use case to refactor query from postgreSQL to Hive as we are moving to Hadoop.

HDP-2.5.0 installed and all the data have been moved to Hive. Next we are changing the queries. In the existing postgres function we have an update of data as below. How can I refactor that for Hive ? Looking for your guidance to accomplish this.

update pricing_reporting_final b set coverage_type=a.X_GEHC_COVERAGE_TYPE from ( select invc_num,X_GEHC_COVERAGE_TYPE from pricing_reporting_final where x_trx_cd='Invoice' group by 1,2) a where a.invc_num=b.Original_Invoice_Num and b.x_trx_cd in ('Rebill','Credit Memo');

3 REPLIES 3

avatar
Rising Star

avatar

Hi @rajdip chaudhuri

Update is supported only for hive table in which ACID property is enabled.

Also by seeing your querying im afraid as I hive will not support such use case as of now. However once MERGE statement is added you will be able to update using join.

Check on this Jira ticket .

https://issues.apache.org/jira/browse/HIVE-10924

If you use your existing query it will fail. Alternatively load the data after performing join into a temp table and then update the target based on your temp table.

avatar
Expert Contributor

With the Hive (newer than Hive 2.2),

you can use Merge INTO

MERGE INTO target_table AS target
USING source_table AS source
ON target.id = source.id
WHEN MATCHED THEN
    UPDATE SET
        target.name = source.name,
        target.age = source.age
WHEN NOT MATCHED THEN
    INSERT (id, name, age)
    VALUES (source.id, source.name, source.age);