Created 07-19-2018 09:31 AM
I have a Hive table created with OpenCSVSerde. When I load the CSV into Hive with
load data local inpath 'final.csv' OVERWRITE into table etltemp.table_staging;
some NULL rows (all fields are NULL) appear in the table. I spot-checked the CSV and ran grep on it, but did not find any unusual rows or empty lines. I have spent some time searching Google and the docs, and I can't find any method to debug or investigate NULL rows in Hive.
Is there a way to identify which rows in final.csv have issues? Or what kind of data issue makes the SerDe treat a row as NULL? Thanks.
Created 07-19-2018 12:45 PM
@Dennis Sin Hive does not support NOT NULL constraints yet.
https://issues.apache.org/jira/browse/HIVE-6905
I would create a view on the table that excludes the NULL records and compare the view against your CSV.
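A minimal sketch of that view, plus a query that may help locate the offending lines in the file. The view name and the columns col_a, col_b, col_c are hypothetical placeholders for the real table columns. INPUT__FILE__NAME and BLOCK__OFFSET__INSIDE__FILE are Hive's built-in virtual columns; for a plain text file the offset should correspond to the position of the row's line in the file, but treat that as an assumption to verify on your data.

```sql
-- Hypothetical view name and column names; replace col_a, col_b, col_c
-- with the actual columns of etltemp.table_staging.
CREATE VIEW IF NOT EXISTS etltemp.table_staging_not_null AS
SELECT *
FROM etltemp.table_staging
-- Keep a row unless every column is NULL.
WHERE NOT (col_a IS NULL AND col_b IS NULL AND col_c IS NULL);

-- Compare row counts: the difference is the number of all-NULL rows.
SELECT COUNT(*) FROM etltemp.table_staging;
SELECT COUNT(*) FROM etltemp.table_staging_not_null;

-- List where the all-NULL rows come from, using Hive virtual columns.
SELECT INPUT__FILE__NAME,
       BLOCK__OFFSET__INSIDE__FILE
FROM etltemp.table_staging
WHERE col_a IS NULL AND col_b IS NULL AND col_c IS NULL
ORDER BY BLOCK__OFFSET__INSIDE__FILE;
```

If the offsets come back as expected, you can seek to those positions in final.csv and inspect the raw lines there.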
Created 07-20-2018 01:44 AM
Thanks for the update on this Hive feature. It would be really desirable. I think I need to discard these NULL records for now.