Hi, I've a table countries with the following structure:
CREATE TABLE countries( id INT, code CHAR(2), name STRING, continent CHAR(2), wikipedia_link STRING, keywords STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
My objective is deleting the wikipedia_link column with the following command:
ALTER TABLE countries DROP COLUMN wikipedia_link
But, when viewing the contents of the table, I realise that, now, the column keywords contains the wikipedia links for each country, instead of the values of the keyword for each country. It seems that only the column is being removed, but not the content of it -my objective is removing both- . To sum up: only the header is being removed, and the following one (keywords) is being shifted to the left.
I've tried, too, altering the columns of the table with this command:
ALTER TABLE countries REPLACE COLUMNS ( id INT, code CHAR(2), name STRING, continent CHAR(2), keywords STRING)
But the result is the same: I'm still seeing the values of wikipedia link for each country in the column "keywords".
Any idea on how to solve this problem?
Even when we replace columns in hive text table the data will not be changed i.e wikipedia_link data will be still presented in HDFS file.
So when we try to access the table hive reads the data with "," delimited and gives wikipedia_link data in place of keywords column.
Steps to drop wikipedia_link column with data:
hive> set hive.support.quoted.identifiers=none; hive> create table countries_temp row format delimited fields terminated by ',' stored as textfile as select `(wikipedia_link)?+.+` from countries; //create temp table with out wikipedia_link columnhive> drop table countries; //drop countries table hive> alter table countries_temp rename to countries; //rename temp table to countries.
Another way would be creating a view without wikipedia_link column.
hive> create view vw_countries as select id,code,name,continent,keywords from countries;
then accessing data from view instead of countries table.