
Hive table: weird behaviour after adding column

I had a perfectly working ACID table. After some time, it turned out that a few additional columns were required, so we added them with:

ALTER TABLE example_table ADD COLUMNS (newCol1 String, newCol2 String, ...)

The columns were added and everything seemed to work fine, until unusual behaviour started to appear. Some (seemingly random) updates to these columns stopped registering: either NULL was written or nothing happened at all, even though the logs showed normal, non-null values. Once in a while, every row turned to NULL in ONLY these new columns, wiping out whatever had been written before. I tried fiddling around for quite some time without luck; the only solution was to recreate the table from scratch with all the required columns.

The question: why was this happening? Is there a better way to add new columns to an existing ACID ORC table without recreating it? My (probably incorrect) idea is that, due to bucketing and partitioning, the data for these new columns was stored in different files than it would be in a freshly created table, and Hive sometimes messed up reads or writes/updates to that table. But I'm not sure how to replicate this and report it so it can be fixed, if it was a bug. If it's not a bug, what could it be? I'm using Hive LLAP from HDP 2.6.2.
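
For anyone trying to reproduce this, the sequence described above can be sketched roughly as follows. This is only a minimal sketch under assumptions: the original table definition is not shown here, so the columns id and val, the partition column dt, the bucket count, and all inserted values are hypothetical placeholders. It also assumes ACID is already enabled on the cluster (transaction manager and concurrency settings). Whether a case this small actually triggers the NULLs is not confirmed.

```sql
-- Hypothetical layout: the real DDL of example_table is not given in the post.
-- A bucketed, partitioned ORC table marked transactional (ACID).
CREATE TABLE example_table (
  id  INT,
  val STRING
)
PARTITIONED BY (dt STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');

-- Write some rows before the schema change, so older files lack the new columns.
INSERT INTO TABLE example_table PARTITION (dt = '2017-01-01')
VALUES (1, 'a'), (2, 'b');

-- Add the new columns the same way as in the question.
ALTER TABLE example_table ADD COLUMNS (newCol1 STRING, newCol2 STRING);

-- Update one of the new columns on a row written before the ALTER.
UPDATE example_table SET newCol1 = 'x' WHERE id = 1;

-- Check what is actually read back for the new columns.
SELECT id, val, newCol1, newCol2
FROM example_table
WHERE dt = '2017-01-01';
```

Running the UPDATE and SELECT repeatedly, before and after compactions, and comparing the results with the values in the logs would show whether the new columns are the only ones affected.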
