- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Error while creating hive table on top of json data
- Labels:
-
Apache Hive
Created 05-01-2018 04:18 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi ,
I am trying to create a hive table on top of son data file but getting the below error.Below are my json file,hive ddl and error.
Json file::::::::
{ "purchaseid": { "ticketnumber": "23546852222", "location": "vizag", "Travelerhistory": { "trav": { "fname": "ramu", "lname": "gogi", "travelingarea": { "destination": { "stationid": "KAJKL", "stationname": "hyd" } }, "food": { "foodpref": [{ "foodcode": "CK567", "foodcodeSegment": "NOVEG" }, { "foodcode": "MM98", "foodcodeSegment": "VEG" } ] } } } } }
Hive DDL::
add jar /home/**********/json-serde-1.3.7-jar-with-dependencies.jar; CREATE external TABLE ds1414( ticketnumber string, location string, Travelerhistory ARRAY<struct<trav :struct<fname:string,lname:string, travelingarea :ARRAY<struct<destination :struct<stationid:string,stationname:string>>>, food :ARRAY<struct<foodpref :struct<foodcode:string,foodcodeSegment:string>>> >>> ) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' location '/user/***********/json2/' ;
Error:::
select * from ds1414;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.ByteArrayInputStream@205df5dc; line: 1, column: 0]) at [Source: java.io.ByteArrayInputStream@205df5dc; line: 1, column: 3] Time taken: 0.212 seconds
Created 05-01-2018 05:02 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Try with the below create table statement
hive> Create table ds1414( purchaseid struct<ticketnumber:string,location:string, Travelerhistory:struct< trav:struct<fname:string,lname:string, travelingarea:struct< destination :struct<stationid:string,stationname:string>>, food :struct<foodpref:array<struct<foodcode:string,foodcodeSegment:string>>> > > >) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' location '/user/***********/json2/';
Examples of Accessing data from hive table as follows
hive> select * from ds1414; +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+ | ds1414.purchaseid | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+ | {"ticketnumber":"23546852222","location":"vizag","travelerhistory":{"trav":{"fname":"ramu","lname":"gogi","travelingarea":{"destination":{"stationid":"KAJKL","stationname":"hyd"}},"food":{"foodpref":[{"foodcode":"CK567","foodcodesegment":"NOVEG"},{"foodcode":"MM98","foodcodesegment":"VEG"}]}}}} | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
hive> select purchaseid.ticketnumber from ds1414; +---------------+--+ | ticketnumber | +---------------+--+ | 23546852222 | +---------------+--+
hive> select ds1414.purchaseid.ticketnumber,purchaseid.location from ds1414; +---------------+-----------+--+ | ticketnumber | location | +---------------+-----------+--+ | 23546852222 | vizag | +---------------+-----------+--+
hive> select purchaseid.Travelerhistory.trav.food.foodpref from ds1414; +-----------------------------------------------------------------------------------------------+--+ | foodpref | +-----------------------------------------------------------------------------------------------+--+ | [{"foodcode":"CK567","foodcodesegment":"NOVEG"},{"foodcode":"MM98","foodcodesegment":"VEG"}] | +-----------------------------------------------------------------------------------------------+--+
hive> select purchaseid.Travelerhistory.trav.food.foodpref.foodcode from ds1414; +-------------------+--+ | foodcode | +-------------------+--+ | ["CK567","MM98"] | +-------------------+--+
hive> select purchaseid.Travelerhistory.trav.food.foodpref.foodcode[0] from ds1414; +--------+--+ | _c0 | +--------+--+ | CK567 | +--------+--+
Let us know if you are having any questions..
Created 05-01-2018 09:40 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Could you make sure you are having valid json message in "/user/***********/json2/" this path and keep all the json message in one line(no new line)
(or)
just create a new file in your local and then put that file into your hadoop directory
bash$ vi sample_json.txt //create a new file with the below sample message in it.<br>{ "purchaseid": { "ticketnumber": "23546852222", "location": "vizag", "Travelerhistory": { "trav": { "fname": "ramu", "lname": "gogi", "travelingarea": { "destination": { "stationid": "KAJKL", "stationname": "hyd" } }, "food": { "foodpref": [{ "foodcode": "CK567", "foodcodeSegment": "NOVEG" }, { "foodcode": "MM98", "foodcodeSegment": "VEG" } ] } } } } }
bash$ hadoop fs -rm -r /user/***********/json2/ //for testing purposes we are deleting all the files in the dirbash$ hadoop fs -put -f sample_json.txt /user/***********/json2/ //put the newly created file into the hdfs dir
-
if you are still having issue attach screenshot of your input data in /user/***********/json2/ this path.
Created 05-07-2018 01:32 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Please use the below DDL
Create EXTERNAL TABLE SRGMSBI1417.json14_07_05_01( purchaseid struct <ticid:string,ticlocation:string,custnum :string, Travleinfo:struct <Trav:struct <fname:string,mname:string, freq:struct <fre:struct <frequencynumber:string,frequnecystatus:string>>, food :struct <foodpref:array<struct<foodcode:string,foodcodeSegment:string>>>, Seats :struct<Seat:array<struct<seatberth:string,loc:string>>>, stations :struct<station:array<struct<sationname:string,trainnum:string>>>>>, Comments :struct<Comment:array<struct<commentno:string,desc:string,passengerseat :struct<intele :string>,passengerloc :struct<intele :string>>>>>) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
location '/user/srgmsbi1417/json14_07_05_01';
i have tested some columns
hive> select purchaseid.ticid from default.json limit 10; +--------+--+ | ticid | +--------+--+ | 1496 | +--------+--+
hive> select purchaseid.Comments.comment.commentno from default.json limit 10; +-----------------+--+ | commentno | +-----------------+--+ | ["1","5","12"] | +-----------------+--+
hive> select purchaseid.Comments.comment[0].commentno from default.json limit 10; +------------+--+ | commentno | +------------+--+ | 1 | +------------+--+
**tip: use some sort of json formatter while creating ddl for so that we can understand easily where to end the struct or array.
-
If the Answer helped to resolve your issue, Click on Accept button below to accept the answer, That would be great help to Community users to find solution quickly for these kind of issues.
Created 05-07-2018 06:05 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
use this ddl
Create EXTERNAL TABLE SRGMSBI1417.json14_07_05_01( purchaseid struct <ticid:string,ticlocation:string,custnum :string, Travleinfo:struct <Trav:struct <fname:string,lname:string,mname:string, freq:struct <fre:struct <frequencynumber:string,frequnecystatus:string>>, food :struct <foodpref:array<struct<foodcode:string,foodcodeSegment:string>>>, Seats :struct<Seat:array<struct<seatberth:string,loc:string>>>, stations :struct<station:array<struct<sationname:string,trainnum:string>>>>>, Comments :struct<Comment:array<struct<commentno:string,desc:string,passengerseat :struct<intele :string>,passengerloc :struct<intele :string>>>>>) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' location '/user/srgmsbi1417/json14_07_05_01';
And don't keep array type for fre because it's struct not the array, that's the reason why you are getting exceptions.
Created 05-01-2018 05:23 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Created 05-07-2018 11:32 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Shu
Thanks for the help.
i have json file as below format , i am creating ddl which works well,but when i do a select query it throwing error.
json file:::
{"purchaseid": {"ticid": "1496","ticlocation": "vizag","custnum": "222","Travleinfo": {"Trav": {"fname": "ramu","mname": "g","freq": {"fre": {"frequencynumber": "9","frequnecystatus": "na"}},"food": {"foodpref": [{"foodcode": "9","foodcodeSegment": "chic"},{"foodcode": "22","foodcodeSegment": "veg"},{"foodcode": "36","foodcodeSegment": "idl"}] },"Seats": { "Seat": [{"seatberth": "upper","loc": "s15"},{"seatberth": "lower","loc": "s215"},{"seatberth": "upper","loc": "s3"}] },"stations": { "station": [{"sationname": "vizag","trainnum": "c197"},{"sationname": "hyd","trainnum": "hyd187"},{"sationname": "wrgl","trainnum": "wr1822"}]}}},"Comments": {"comment": [{"commentno": "1","desc": "journey","passengerseat": { "intele": "09" },"passengerloc": { "intele": "s15" }},{"commentno": "5","desc": " food","passengerseat": { "intele": "09" },"passengerloc": { "intele": "s15" }},{"commentno": "12","desc": " service","passengerseat": { "intele": "09" },"passengerloc": { "intele": "s15" } }]}}}
DDL:::
Create EXTERNAL TABLE SRGMSBI1417.json14_07_05_01( purchaseid struct<ticid:string,ticlocation:string,custnum :string, Travleinfo:struct< trav:struct<fname:string,lname:string,mname:string, freq :struct<fre:array<struct<frequencynumber:string,frequnecystatus:string>>>, food :struct<foodpref:array<struct<foodcode:string,foodcodeSegment:string>>>, Seats :struct<Seat:array<struct<seatberth:string,loc:string>>>, stations :struct<station:array<struct<sationname:string,trainnum:string>>> >>, Comments :struct<Comment:array<struct<commentno:string,desc:string,passengerseat :struct<intele :string>,passengerloc :struct<intele :string> >>> > ) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' location '/user/srgmsbi1417/json14_07_05_01';
error:::
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start of Array
Created 05-07-2018 03:47 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @Shu
Thanks for the help.
i have json file as below format , i am creating ddl which works well,but when i do a select query it throwing error.
json file:::
{"purchaseid": {"ticid": "1496","ticlocation": "vizag","custnum": "222","Travleinfo": {"Trav": {"fname": "ramu","mname": "g","freq": {"fre": {"frequencynumber": "9","frequnecystatus": "na"}},"food": {"foodpref": [{"foodcode": "9","foodcodeSegment": "chic"},{"foodcode": "22","foodcodeSegment": "veg"},{"foodcode": "36","foodcodeSegment": "idl"}] },"Seats": { "Seat": [{"seatberth": "upper","loc": "s15"},{"seatberth": "lower","loc": "s215"},{"seatberth": "upper","loc": "s3"}] },"stations": { "station": [{"sationname": "vizag","trainnum": "c197"},{"sationname": "hyd","trainnum": "hyd187"},{"sationname": "wrgl","trainnum": "wr1822"}]}}},"Comments": {"comment": [{"commentno": "1","desc": "journey","passengerseat": { "intele": "09" },"passengerloc": { "intele": "s15" }},{"commentno": "5","desc": " food","passengerseat": { "intele": "09" },"passengerloc": { "intele": "s15" }},{"commentno": "12","desc": " service","passengerseat": { "intele": "09" },"passengerloc": { "intele": "s15" } }]}}}
DDL:::
Create EXTERNAL TABLE SRGMSBI1417.json14_07_05_01( purchaseid struct<ticid:string,ticlocation:string,custnum :string, Travleinfo:struct< trav:struct<fname:string,lname:string,mname:string, freq :struct<fre:array<struct<frequencynumber:string,frequnecystatus:string>>>, food :struct<foodpref:array<struct<foodcode:string,foodcodeSegment:string>>>, Seats :struct<Seat:array<struct<seatberth:string,loc:string>>>, stations :struct<station:array<struct<sationname:string,trainnum:string>>> >>, Comments :struct<Comment:array<struct<commentno:string,desc:string,passengerseat :struct<intele :string>,passengerloc :struct<intele :string> >>> > ) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' location '/user/srgmsbi1417/json14_07_05_01';
error:::
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start of Array
Created 10-23-2018 12:36 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
i have this type of data anyone plzz help me to define the schema in hive
{
"md5": "bcgjcjdshhdfhkfd",
"dat": "15/08/2000",
"data": '[{"job_id":"589d0147aac2ed8665ce94d6","environment_id":100,"environment_description":"Windows732bit","size":109056,"type":"PE32executable(GUI)Intel80386, forMSWindows","type_short":["peexe"],"target_url":null,"state":"SUCCESS","error_type":null,"error_origin":null,"submit_name":"svchost.exe","md5":"7657fcb7d772448a6d8504e4b20168b8","sha1":"84c7201f7e59cb416280fd69a2e7f2e349ec8242","sha256":"54bc950d46a0d1aa72048a17c8275743209e6c17bdacfc4cb9601c9ce3ec9a71","sha512":"786addd2a793bd4123625b22dc717d193246442ac97f1c3f4a763ec794b48e68051cd41097c0e9f7367e6914534f36eafccb109ab03dc793d68bf1522e7884e2","ssdeep":"1536: xfZZnAEjEIZvumULmj4wrraK5dZ4Ltta9Km\/ec3DtAL6bmZ4bXSjrAE+fySPoqRr: xnnAQVG\/LytaKItS\/fiLKS+f5Aq7is","imphash":"093a51e0b7dcb2466b7edfd78d191aa0","av_detect":91,"vx_family":"Gen: Variant.Kazy","url_analysis":false,"analysis_start_time":"2017-02-10T00: 54: 57-06: 00","threat_score":100,"interesting":false,"threat_level":2,"verdict":"malicious","certificates":[],"domains":["stromoliks.com","promoliks.com"],"classification_tags":[],"compromised_hosts":[],"hosts":[],"total_network_connections":0,"total_processes":2,"total_signatures":37,"extracted_files":[],"processes":[],"file_metadata":null,"tags":[],"mitre_attcks":[]}]'
}