Deletion of records from hive table based on some condition
- Labels: Apache Hive
Created 12-22-2017 06:22 AM
Hi,
We have a requirement to delete records from a Hive table based on some condition.
We also need to make sure that if data with the same ID arrives in the future, it is not inserted.
For the deletion, one approach is to load the required data into another table, drop the original table, and rename the new one.
But how do we ensure that future data with the same ID is not inserted?
Please suggest a better, more optimized approach for deletions, if there is one.
Thanks.
Created 12-22-2017 01:40 PM
It is better to maintain all the deleted IDs in one staging table and, while loading data into the Hive tables, check whether each ID already exists in the staging table, using either a join or a MERGE statement.
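The join-based check could look roughly like the HiveQL below. This is only a sketch: the table names `events`, `events_incoming`, and `deleted_ids`, and the columns `id` and `payload`, are made up for illustration and would need to match your actual schema.

```sql
-- Hypothetical tables: events (target), events_incoming (new batch),
-- deleted_ids (staging table of IDs that must never be reloaded).
CREATE TABLE IF NOT EXISTS deleted_ids (id BIGINT);

-- Anti-join: keep only incoming rows whose ID has no match in deleted_ids.
INSERT INTO TABLE events
SELECT i.id, i.payload
FROM events_incoming i
LEFT OUTER JOIN deleted_ids d
  ON i.id = d.id
WHERE d.id IS NULL;
```

Because only unmatched rows pass the `d.id IS NULL` filter, duplicate IDs in `deleted_ids` do not multiply the loaded rows.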
Created 12-22-2017 06:27 PM
If you are using version 2.6 or later, you can turn on ACID and execute DELETE commands. You can audit your deletes via your app if needed. Otherwise, DELETE, MERGE, and UPDATE can run on Hive directly with ACID.
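As a rough sketch of what the ACID route involves, assuming the cluster already has transactions and compaction enabled on the metastore side: the table `events_acid`, its columns, the `status = 'obsolete'` condition, and the `deleted_ids` table are all hypothetical, and the exact settings required depend on your Hive version and distribution.

```sql
-- Session settings commonly needed for ACID; check your cluster's defaults.
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Hypothetical transactional table. ORC storage and bucketing were
-- requirements for ACID tables in Hive 1.x/2.x.
CREATE TABLE IF NOT EXISTS events_acid (id BIGINT, payload STRING, status STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- Hypothetical plain table holding IDs that must not be reloaded.
CREATE TABLE IF NOT EXISTS deleted_ids (id BIGINT);

-- Record the IDs first, so later loads can filter them out.
INSERT INTO TABLE deleted_ids
SELECT id FROM events_acid WHERE status = 'obsolete';

-- Then delete the matching rows in place.
DELETE FROM events_acid WHERE status = 'obsolete';
```

Recording the IDs before running the DELETE matters here, since the rows that identify them are gone afterwards.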
Created 12-26-2017 10:21 AM
