Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Hive table becomes all NULLs after add new columns

avatar
New Contributor

I have a hive table with three columns, delimited by tabs

hive (database)> describe formatted my_table;
# col_name data_type comment
field1 string
field2 string
field3 string
... ...
Storage Desc Params:
input.regex (\\S+)\\s+(\\S+)\\s+(.*)

hive (database)> select * from my_table limit 2;
DataX DataX DataX
DataX DataX DataX

I want to add two new columns to the hive table schema, so I did:

// add two columns
hive (database)> alter table my_table add columns(field4 string,field5 string);
// add two more (\\S+)\\s+ for the new columns
hive (database)> alter table my_table SET SERDEPROPERTIES ("input.regex" = "(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(.*)");

But seems it doesn't work with the old dataset (the old data file with three columns):

hive (database)> select * from my_table limit 2;
NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL

I expected something like this:

DataX DataX DataX NULL NULL
DataX DataX DataX NULL NULL

 Could someone help me understand why all the fields show NULL after adding the columns to schema? Thanks in advance!

1 REPLY 1

avatar
Master Collaborator

@SwaggyPPPP Is this a partitioned table? In that case you could run the ALTER TABLE  command as follows:

alter table my_table add columns(field4 string,field5 string) CASCADE;

Let us know if this issue occurs consistently, after adding new columns, and your Cloudera product version?