Hi,
Need help on below query.
We have a scenario where for any column defined with datatype as string in Hadoop, NULL value is loaded as blank. However, any column other than string datatype (INT,DOUBLE etc) NULL values are loaded as NULL in Hadoop.
Column name Data Type
service1end string
service1start string
service2end string
service2start string
firstlinemaintcost double
Is this a default behavior of Hadoop/hive?
Created 03-09-2020 05:24 AM
Created 03-09-2020 02:41 AM
@ARVINDR ,
Yes, this is a default behaviour in Hive. When an empty value ( '' ) is passed for STRING datatype, it is taken as it is.
If you would like the null values for STRING to be treated as NULL instead of just empty, you can set this property - serialization.null.format for the table of your choice. For example, you can do this on a new table or to an existing table by running a statement like the below,
New Table:
CREATE TABLE null_test_1
(service1end string,
service1start string,
service2end string,
service2start string,
firstlinemaintcost double)
TBLPROPERTIES (
'serialization.null.format'='');
Existing Table:
alter table null_test set tblproperties ('serialization.null.format'='');
Once done, the empty values inserted for the columns with STRING datatype will be shown as NULL.
Hope this helps!
Created 03-09-2020 05:24 AM