Support Questions
Find answers, ask questions, and share your expertise
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Problem removing column from Hive table


Problem removing column from Hive table

New Contributor

Hi, I've a table countries with the following structure:

CREATE TABLE countries( 
	id INT, 
	code CHAR(2), 
	name STRING,
	continent CHAR(2), 
	wikipedia_link STRING, 
	keywords STRING) 

My objective is deleting the wikipedia_link column with the following command:

ALTER TABLE countries DROP COLUMN wikipedia_link

But, when viewing the contents of the table, I realise that, now, the column keywords contains the wikipedia links for each country, instead of the values of the keyword for each country. It seems that only the column is being removed, but not the content of it -my objective is removing both- . To sum up: only the header is being removed, and the following one (keywords) is being shifted to the left.

I've tried, too, altering the columns of the table with this command:

        id INT, 
        code CHAR(2), 
        name STRING, 
        continent CHAR(2), 
        keywords STRING)

But the result is the same: I'm still seeing the values of wikipedia link for each country in the column "keywords".

Any idea on how to solve this problem?



Re: Problem removing column from Hive table

Super Guru

Even when we replace columns in hive text table the data will not be changed i.e wikipedia_link data will be still presented in HDFS file.

So when we try to access the table hive reads the data with "," delimited and gives wikipedia_link data in place of keywords column.


Steps to drop wikipedia_link column with data:

hive> set;

hive> create table countries_temp row format delimited fields terminated by ',' stored as textfile as select `(wikipedia_link)?+.+` from countries; //create temp table with out wikipedia_link columnhive> drop table countries; //drop countries table
hive> alter table countries_temp rename to countries; //rename temp table to countries.


Another way would be creating a view without wikipedia_link column.

hive> create view vw_countries as select id,code,name,continent,keywords from countries;

then accessing data from view instead of countries table.