Support Questions
Find answers, ask questions, and share your expertise

HIVE UNIQUE Constraint allows duplicate data insertion.How the primary key and unique constraints are different from RDBMS systems in HIVE?

Contributor

create table prakash2(id int UNIQUE DISABLE NOVALIDATE,PRIMARY KEY(id) DISABLE NOVALIDATE);

 

insert into table prakash2 values(1);

insert into table prakash2 values(1);

insert into table prakash2 values(1);

 

+--------------+
| prakash2.id |
+--------------+
| 1 |
| 1 |
| 1 |
+--------------+
3 rows selected (0.149 seconds)

 

As per RDBMS concepts primary key itself has unique constraint but here in hive mentioning unique constraint explicitly allowing me to insert duplicate values.

 

How primary key and unique constraints are different HIVE apart from the traditional RDBMS systems?

1 REPLY 1

Mentor

@Prakashcit 

 

There is a Jira https://issues.apache.org/jira/browse/HIVE-16575 last updated on 05/Dec/19 Hive does not enforce foreign keys to refer to primary keys or unique keys.

In your previous thread, I explained what a NOVALIDATE constratriant is "A NOVALIDATE constraint is basically a constraint that can be enabled but for which hive will not check the existing data to determine whether there might be data that currently violate the constraint"

The difference between a UNIQUE constraint and a Primary Key is that per table you may only have one Primary Key but you may define more than one UNIQUE constraint. Primary Key constraints are not nullable.UNIQUE constraints may be nullable. Oracle also implements the NOVALIDATE constraint here is a write-up by Richard Foote

When you create a UNIQUE constraint, the database automatically creates a UNIQUE index. For RDBMS databases, a PRIMARY KEY will generate a unique CLUSTERED INDEX. A UNIQUE constraint will generate a unique NON-CLUSTERED INDEX.