Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

Highlighted

How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

Explorer

We have a Hive table with some columns being arrays and structs. When inserting a row into the table, if we do not have any value for the array and struct column and want to insert a NULL value for them, how do we specify in the INSERT statement the NULL values?

For e.g., if the table definition is:

C1 string

C2 struct<c2_a:array<string>, c2_c:string>

C3 array<struct<c3_a:string, c3_b:string>>

then to insert NULL value to C2 and C3 columns, I was expecting to have the INSERT statement as: “insert into table some_test_table select "c1_val", NULL, NULL from z_dummy;” where z_dummy is a 1 row table. But Hive reports a syntax error.

Any idea on how to specify the correct syntax to use for the above example?

8 REPLIES 8
Highlighted

Re: How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

Hive does not support literals for complex types (array, map, struct, union), so it is not possible to use them in INSERT INTO...VALUES clauses. This means that the user cannot insert data into a complex datatype column using the INSERT INTO...VALUES clause. link

Highlighted

Re: How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

Explorer

Hive does support literals for complex types. I didnt find any documentation for it, but I have the syntax that works.

Highlighted

Re: How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

New Contributor

When you load data into hive table the default Missing values are represented by the special value NULL. SO for simple or complex queries its not possible to insert null values into hive tables using INSERT INTO clause.

Re: How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

Contributor

@Ravi Chinni

insert into some_test_table
select 'c1_val',named_struct('c2_a',array(cast (null as string)),'c2_c',cast (null as string)),array(named_struct('c3_a',cast (null as string) ,'c3_b',cast (null as string))) from z_dummy;

This will work for you.

Needless to say, please upvote if the answer was useful. :)

Highlighted

Re: How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

Explorer

@Sagar Morakhia

When you say 'array(cast (null as string))', aren't we inserting one element, whose value is null into the array? The array size here is 1 and the array itself isn't null. For e.g. when I select the row that was inserted using the above statement, I get the following:

hive> select C2, C3 from complex_null_test;

{"c2_a":null,"c2_c":null} [{"c3_a":null,"c3_b":null}]

Notice in the above that the array itself has 1 element of struct type whose values are null.

Highlighted

Re: How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

New Contributor
@Ravi Chinni

When you load data the default Missing values are represented by the special value NULL.

For Example: Find the sample data:

1,abc,4000,maths$physics$chemistry,hyd

2,hadoop,3000,maths$physics,bang

3,mapreduce,4000,social,bang

For the above data the table create statement: create table array_table(id int,name string,sal int,sub array<string>, city string) row format delimited fields terminated by ',' collection items terminated by '$';

Load statement:

load data local inpath '/Users/abc/Downloads/arrayfile.txt' into table array_table;

hive> select * from array_table;

OK

1abc40000["maths","physics","chemistry"]hyd

2hadoop3000["maths","physics"]bang

3mapreduce4000["social"]bang

Time taken: 0.086 seconds, Fetched: 3 row(s)

hive> select * from array_table where sub[2] is not null;

OK

1abc40000["maths","physics","chemistry"]hyd

Time taken: 0.091 seconds, Fetched: 1 row(s)

hive> select * from array_table where sub[1] is not null;

OK

1abc40000["maths","physics","chemistry"]hyd

2hadoop3000["maths","physics"]bang

Time taken: 0.089 seconds, Fetched: 2 row(s)

Is this what you are looking for.? Can you kinldy respond back if it meets your requirement.?

Highlighted

Re: How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

Explorer

How do you load data into the above table, with data for some complex columns missing?

Highlighted

Re: How to insert NULL value into Hive complex columns (array & struct),Having NULL value for Hive complex column (struct & array)

New Contributor

@Ravi Chinni

When you load data into hive table the default Missing values are represented by the special value NULL.

For Example:

1,abc,40000,maths$physics$chemistry,hyd

2,hadoop,3000,maths$physics,bang

3,mapreduce,4000,social,bang

Find the Below Queries:

hive> create table array_table(id int,name string,sal int,sub array<string>, city string) row format delimited fields terminated by ',' collection items terminated by '$';

OK

Time taken: 0.06 seconds

hive> load data local inpath '/Users/abc/sample.txt' into table array_table;

Loading data to table 7may.array_table

Table 7may.array_table stats: [numFiles=1, totalSize=106]

OK

Time taken: 0.146 seconds

hive> select * from array_table;

OK

1abc40000["maths","physics","chemistry"]hyd

2hadoop3000["maths","physics"]bang

3mapreduce4000["social"]bang

Time taken: 0.121 seconds, Fetched: 3 row(s)

hive> select * from array_table where sub[2] is not null;

OK

1abc40000["maths","physics","chemistry"]hyd

Time taken: 0.091 seconds, Fetched: 1 row(s)

hive> select * from array_table where sub[1] is not null;

OK

1abc40000["maths","physics","chemistry"]hyd

2hadoop3000["maths","physics"]bang

Time taken: 0.089 seconds, Fetched: 2 row(s)

Kindly reply if it is useful for your requirement.

Don't have an account?
Coming from Hortonworks? Activate your account here