Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

load array of values into HIVE

avatar
Explorer

Hi All,

I did load the below XML file into HIVE table” test_table.customer_details” with a schema

Hive> Create table test_table.customer_details ( Customers array<struct<customer_details:struct<first_name:string,last_name:string,DOB:string,Addr1:string,City:string,state:string,country:string>>> )

row format serde 'com.ibm.spss.hive.serde2.xml.XmlSerDe'

with serdeproperties

(

“Column.xpath.customerdetails” = “/FormResponse/Customers/customer_details”

)

INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'

OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'

TBLPROPERTIES (

"xmlinput.start"="<FormResponse>",

"xmlinput.end"="</FormResponse>"

);

XML file

<FormResponse>

<Customers>

<customer_details>

<First_name>Anji</First_name>

<Last_name> Raju</Last_name>

<DOB>06/24/1278<DOB>

<Addr1> 14 duck st </Addr1>

<City> boston </City>

<State> OH </State>

<Country> USA </Country>

</customer_details>

<customer_details>

<First_name>Jeet</First_name>

<Last_name> Anu</Last_name>

<DOB>06/24/1279<DOB>

<Addr1> tuttles groove </Addr1>

<City> denver </City>

<State> CA </State>

<Country> USA </Country>

</customer_details>

<customer_details>

<First_name>Test1</First_name>

<Last_name> Test_last</Last_name>

<DOB>006/24/1280<DOB>

<Addr1> Sleek street </Addr1>

<City> cali </City>

<State> MA </State>

<Country> USA </Country>

</customer_details>

</Customers>

</FormResponse>

When I write the command:

Hive> Select explode(customers.customer) from test_table.customer_details;

The output is

{“First_Name”:”Anji”,”Last_name”:”Raju”,”DOB”:” 06/24/1278”,”Addr1”:” 14 duck st”,”City”:”boston”,”State”:”OH”,”Country”:”USA”}

{“First_Name”:”Jeet”,”Last_name”:”Anu”,”DOB”:” 06/24/1279”,”Addr1”:” tuttles grove”,”City”:”denver”,”State”:”CA”,”Country”:”USA”}

{“First_Name”:”Test1”,”Last_name”:”Test_last”,”DOB”:” 06/24/1280”,”Addr1”:” Sleek street”,”City”:”cali”,”State”:”MA”,”Country”:”USA”}

I did create a new table

create table test_table.new ( customer array<struct<first_name:string,last_name:string,DOB:string, Addr1:string, city:string, State:string, Country:string >>);

I did try writing a hive command something like below to get the expected output but I am getting an Error:

hive> insert into test_table.new select explode( customers.customer_details) from test_table.customer_details;

FAILED: SemanticException [Error 10044]: Line 1:12 Cannot insert into target table because column number/types are different 'test_mess1': Cannot convert column 1 from array<struct<first_name:string,last_name:string,DOB:string,Addr1:string,city:string,State:string,Country:string >> to

array<struct<first_name:string,last_name:string,DOB:string, Addr1:string, city:string, State:string, Country:string >>

The Expected output should be:

First_name Last_name DOB Addr1 City State Country
Anji Raju 06/24/1278 14 duck st boston OH USA
Jeet Anu 06/24/1279 tuttles groove denver CA USA
Test1 Test_lat 06/24/1280 Sleek street cali MA USA


Could someone help me to resolve this issue???

4 REPLIES 4

avatar

Hello @Anji Raju!
Try this out:

#My XML content (don't forge to close your DOB TAG!)
<FormResponse> <Customers> <customer_details> <First_name>Anji</First_name> <Last_name> Raju</Last_name> <DOB>06/24/1278</DOB> <Addr1> 14 duck st </Addr1> <City> boston </City> <State> OH </State> <Country> USA </Country> </customer_details> <customer_details> <First_name>Jeet</First_name> <Last_name> Anu</Last_name> <DOB>06/24/1279</DOB> <Addr1> tuttles groove </Addr1> <City> denver </City> <State> CA </State> <Country> USA </Country> </customer_details> <customer_details> <First_name>Test1</First_name> <Last_name> Test_last</Last_name> <DOB>006/24/1280</DOB> <Addr1> Sleek street </Addr1> <City> cali </City> <State> MA </State> <Country> USA </Country> </customer_details> </Customers> </FormResponse> #DDL for the table (same as you) CREATE EXTERNAL TABLE default.customer_details ( Customers array<struct<customer_details:struct< First_name:string, Last_name:string, DOB:string, Addr1:string, City:string, State:string, Country:string>>> ) row format serde 'com.ibm.spss.hive.serde2.xml.XmlSerDe' with serdeproperties ( "column.xpath.Customers" = "/FormResponse/Customers/customer_details" ) STORED AS INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat' LOCATION 'hdfs://Admin-TrainingNS/user/hive/warehouse/customer_details' TBLPROPERTIES ( "xmlinput.start"="<FormResponse>", "xmlinput.end"="</FormResponse>" ); #Query to extract columns select inline(Customers.customer_details) from default.customer_details; OK Anji Raju 06/24/1278 14 duck st boston OH USA Jeet Anu 06/24/1279 tuttles groove denver CA USA Test1 Test_last 006/24/1280 Sleek street cali MA USA Time taken: 0.08 seconds, Fetched: 3 row(s)

Hope this helps! 🙂

avatar
Explorer

Hi @Vinicius Higa Murakami , Thanks for the reply.

I was trying to use inline function with SELECT statement. but its throwing an Error.

select strngname, inline(Customers.customer_details) from default.customer_details;

BUT its throwing below Error:

SemanticException [Error 10081]: UDTf's are not supported outside the SELECT clause , nor nested in expressions:
Could you please help me with this.

Thanks,

avatar

Hi @Anji Raju!
In this case, you'll need to use the LATERAL VIEW to use the inline function with another column.

hive> select c.*,'Vini' from default.customer_details lateral view inline(Customers.customer_details) c;
OK
Anji	Raju	06/24/1278	14 duck st	boston	OH	USA	Vini
Jeet	Anu	06/24/1279	tuttles groove	denver	CA	USA	Vini
Test1	Test_last	006/24/1280	Sleek street	cali	MA	USA	Vini
Time taken: 0.094 seconds, Fetched: 3 row(s)

Hope this helps!

avatar

Thanks, this really helps

Sundar Gampa