Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Struct datatype insert in hive

Highlighted

Struct datatype insert in hive

I have created a table create table data ( id int,name string, addreess struc<city:string,State:string>) rows format delimited fields terminated by ',' collections terminated by ',' stored as 'TEXTFILE';

If I use insert into table data select 1,'Bala',named_struct('city','Tampa','state','FL') from anothertable limit 1; Is this correct way to insert a record into hive table. Because when I do that address.state is inserted as null

2 REPLIES 2
Highlighted

Re: Struct datatype insert in hive

Contributor

Hive does not support literals for complex types (array, map, struct, union), so it is not possible to use them in INSERT INTO...VALUES clauses. This means that the user cannot insert data into a complex datatype column using the INSERT INTO...VALUES clause. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Insertingdatai...

Re: Struct datatype insert in hive

Hi @Bala Vignesh N V. I know this is an old question, but I have encountered this recently. This answer may help someone else as well...

The issue you had is most likely caused by specifying " COLLECTION ITEMS TERMINATED BY ',' ".

When the table is defined like this (with COLLECTION ITEMS TERMINATED BY comma):

-- Create a Dummy table to use in the insert query - like an Oracle DUAL table
create table dummy_TBL (col1 int) ;
insert into dummy_TBL (col1) values(1) ; 

create table data_TBL (id int, name string, address struct<city:string,State:string>) 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY ','
STORED AS TEXTFILE; 

insert into table data_TBL 
select 1,
      'Bala',
      named_struct('city','Tampa','State','FL') 
from dummy_TBL limit 1;

The address.state value is NULL:

+--------------+----------------+--------------------------------+--+
| data_tbl.id  | data_tbl.name  |        data_tbl.address        |
+--------------+----------------+--------------------------------+--+
| 1            | Bala           | {"city":"Tampa","state":null}  |
+--------------+----------------+--------------------------------+--+

But when you define the table like this (without COLLECTION ITEMS TERMINATED BY comma):

create table data_TBL (id int, name string, address struct<city:string,State:string>) 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ','  
STORED AS TEXTFILE; 

insert into table data_TBL 
select 1,
      'Bala',
      named_struct('city','Tampa','State','FL') 
from dummy_TBL limit 1;

The address.state value is correct:

+--------------+----------------+--------------------------------+--+
| data_tbl.id  | data_tbl.name  |        data_tbl.address        |
+--------------+----------------+--------------------------------+--+
| 1            | Bala           | {"city":"Tampa","state":"FL"}  |
+--------------+----------------+--------------------------------+--+

I hope this helps.

Don't have an account?
Coming from Hortonworks? Activate your account here