HBase insert overwrite through Hive view

New Contributor

Hi, I'm seeing data loss when running INSERT OVERWRITE through multiple Hive views, and for some column qualifiers the data is getting rolled back. Can we upsert data through multiple Hive views on a single HBase table? Can anyone help me understand this issue?

1 REPLY

Super Collaborator

 

  • Data Loss: When you perform an INSERT OVERWRITE operation in Hive, it completely replaces the data in the target table or partition. If the data is not inserted correctly, this can result in data loss.
  • Column Qualifiers: HBase stores data in a key-value format with rows, column families, and column qualifiers. Issues with specific column qualifiers could be due to schema mismatches or data type incompatibilities.

  • Upserting Data: Upserting (update or insert) into HBase via Hive can be challenging, because Hive is primarily a batch-processing engine and has no native upsert operation. Tables that use the HBase storage handler are also external tables.
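To make the upsert point concrete: in HBase itself, a write is a Put that sets values cell by cell (row key plus column family:qualifier), and a read returns the cell version with the newest timestamp. The sketch below is a toy in-memory model of that behavior only. It is not the HBase client API, and the class name `ToyHBaseTable`, the column names, and the two "views" are hypothetical. It shows how writers that touch different qualifiers of the same row merge, and how a write carrying an older timestamp can appear to have no effect.

```python
import itertools


class ToyHBaseTable:
    """Toy stand-in for one HBase table: row key -> column -> (timestamp, value)."""

    def __init__(self):
        self.rows = {}
        # Hypothetical stand-in for server-assigned, increasing timestamps.
        self._clock = itertools.count(1)

    def put(self, row_key, cells, timestamp=None):
        """Write the given cells; only the listed columns are touched."""
        ts = next(self._clock) if timestamp is None else timestamp
        row = self.rows.setdefault(row_key, {})
        for column, value in cells.items():
            current = row.get(column)
            # The visible value changes only if this write is at least as new.
            if current is None or ts >= current[0]:
                row[column] = (ts, value)

    def get(self, row_key):
        """Return the visible (newest) value of every column in the row."""
        return {col: val for col, (_, val) in self.rows.get(row_key, {}).items()}


table = ToyHBaseTable()

# Hypothetical "view A" maps only cf:name, "view B" maps only cf:city.
table.put("row1", {"cf:name": "alice"})   # via view A
table.put("row1", {"cf:city": "Pune"})    # via view B
table.put("row1", {"cf:name": "alicia"})  # via view A again (an upsert)
merged = table.get("row1")

# A write with an explicitly older timestamp does not change what is read.
table.put("row1", {"cf:city": "Mumbai"}, timestamp=1)
after_stale_write = table.get("row1")

print(merged)
print(after_stale_write)
```

If the symptoms in the question resemble the second case, comparing the timestamps of the affected cells in the HBase shell may be a reasonable first check; this is a suggestion for investigation, not a diagnosis.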

    Best Practices and Troubleshooting

    • Schema Matching: Ensure that the schema of the Hive table and the HBase table matches, especially the data types and column qualifiers.
    • Data Types: Be cautious with data types. HBase stores everything as bytes, so type conversions must be handled properly.

    • Error Handling: Implement proper error handling and logging to identify issues during data insertion.
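The schema and data type points above can be sketched in a few lines of Python. The helper `check_mapping` is hypothetical and only illustrates two sanity checks on a `hbase.columns.mapping` string: one mapping entry per Hive column, and no qualifier mapped twice. The second half shows why byte encodings matter: the same integer stored as text and as a 4-byte big-endian value (the layout HBase's `Bytes.toBytes(int)` produces) yields different bytes, and reading one as the other fails. The column names and mapping strings are made up for the example.

```python
import struct


def check_mapping(hive_columns, columns_mapping):
    """Return a list of problems found in a hbase.columns.mapping string."""
    entries = [entry.strip() for entry in columns_mapping.split(",")]
    problems = []
    if len(entries) != len(hive_columns):
        problems.append(
            f"{len(hive_columns)} Hive columns but {len(entries)} mapping entries"
        )
    # Drop any storage-type suffix such as '#b' before comparing qualifiers.
    targets = [entry.split("#")[0] for entry in entries]
    duplicates = sorted({t for t in targets if targets.count(t) > 1})
    if duplicates:
        problems.append("mapped more than once: " + ", ".join(duplicates))
    return problems


# Hypothetical Hive columns and mappings.
hive_columns = ["id", "name", "age"]
ok = check_mapping(hive_columns, ":key,cf:name,cf:age#b")
bad = check_mapping(hive_columns, ":key,cf:name")

# The same integer, stored as text versus as a 4-byte big-endian binary value.
as_string = str(42).encode("utf-8")
as_binary = struct.pack(">i", 42)

# Reading the binary cell as if it were text does not give back the number.
try:
    parsed = int(as_binary.decode("utf-8"))
except ValueError:
    parsed = None

# Reading it with the matching binary layout does.
decoded = struct.unpack(">i", as_binary)[0]

print(ok, bad)
print(as_string, as_binary, parsed, decoded)
```

When different views or writers disagree on whether a qualifier holds text or binary values, the stored cells can be intact while queries still return NULLs or unreadable values, so it helps to confirm the encoding before concluding that data was lost.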