I am trying to analyze if we can have a NOT NULL constraints in Hive tables.
In the below link of Apache Hive page,
It has a "constraint_specification" for the Create table statement for primary_keyt etc. I tried but couldn't find the "constraint_specification" in Cloudera site for Hive. My questions are
1. Is it possible to create a Hive table with Not Null constraints? If yes, can you please provide me syntax.
2. What are the constraints are supported in Hive table? If possible please provide me the link which has the details on constraints supported by Hive for Cloudera distribution.
Appreciate you help.
Hive is designed for schema on Read. Meaning Hive has not control over the underlying storage .
You can damage the data and still managed to query using hive . Let say if the schema does not match the file contents then Hive will try its best to read it. going down fruther it will produce null values if its non numberic strings. Where as in traditional database you write update insert and the database has control over the storage it will enforce the schema while writing thats why it is schema on write. So to sum up you wont be ablecreate Not Null constraints hive table and enforce it by design .
They have this issue fixed in hive 3.0.0 , They have been able to add hive support for "Not NULL " & "UNIQUE" constraint . Check out this link