Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Explode Array of Structs using get_json_object in hive

Highlighted

Explode Array of Structs using get_json_object in hive

Hi,

I have one column in hive table wherein I have stored entire json data map as string. I am using get_json_object to fetch each element of json. However I have one element which is array of structs. I need to explode that array of structs.

Event Sample:

{"evtDataMap":{"ucmEvt":{"rscDrvdStateEntMap":[{"prov_Id":"OCP","acct_Id":"ACCOUNT_0","chanlTypeId":"Chat","derivedAvlFlg":"TRUE","activeWrkCnt":"0","avlState":"READY","workLimit":"2"},{"providerId":"OCP Email","accountId":"ACCOUNT_0","channelTypeId":"Email","derivedAvailableFlag":"TRUE","activeWorkCount":"0","availabilityState":"READY","workLimit":"2"},{"providerId":"OCP ShortMessageService","accountId":"ACCOUNT_0","channelTypeId":"ShortMessageService","derivedAvailableFlag":"TRUE","activeWorkCount":"0","availabilityState":"READY","workLimit":"2"}]}}}

Entire evtDataMap is stored in a hive column and I want the output like

prov_id acct_Id chanlTypeId derivedAvlFlg activeWrkCnt avlState workLimit
OCP ACCOUNT_0 Chat TRUE 0 READY 2
OCP Email ACCOUNT_0 Email TRUE 0 READY 2
OCP ShortMessageService ACCOUNT_0 hortMessageServic TRUE 0 READY 2

Basically I want to flatten the array of structs.

I am using query like this

with temp as (
select evName,get_json_object(evtDataMap,'$.ucmEvt.rscDrvdStateEntMap') as mapp from avaya.jmsrec_temp where evtName ='USER')
Select evtName, a.prov_id,a.acct_Id,a.chanlTypeId ,a.derivedAvlFlg,a.activeWrkCnt,a.avlState,a.workLimit from temp lateral view inline(temp.mapp) a as prov_id,acct_Id,chanlTypeId ,derivedAvlFlg,activeWrkCnt,avlState,workLimit 

eventName is a hive table column.

I am getting an error as "FAILED: UDFArgumentException Top level object must be an array but was string"

Tried array over get_json_object but still I am getting an error

array(get_json_object(eventDataMap,'$.ucmEvent.user.resourceDerivedStateEntryMap')) as mapp

Error

FAILED: UDFArgumentException The sub element must be struct, but was string

I am stuck with this issue since 3 days, any help is much much appreciated..

3 REPLIES 3
Highlighted

Re: Explode Array of Structs using get_json_object in hive

Super Guru
@Anish Gupta

Could you create the Hive table using JsonSerde and then apply explode array as per this SO thread.

Highlighted

Re: Explode Array of Structs using get_json_object in hive

@Shu Hi Shu, Thanks for reply !!

No its actually not possible for me to store this JSON in separate hive table using json serde. Actually there are 360 different type of events with underlying schema changes and I need to process them dynamically on fly. So what I have done is stored the datamap(which is one of the key in entire json event) into a hive column table and then I am using get_json_object to fetch the json elements as and when required. But the datamap(stored as a string column) itself has nested complex structure so I want to unnest that.

Highlighted

Re: Explode Array of Structs using get_json_object in hive

New Contributor

Hello @DataGeek_Anish ,

 

Did you find the solution of this? I also stuck into the same problem and could not able to solve it.

I have JSON array data in HIVE tables. The data type of the column is ARRAY.

Could anyone please help me to resolve this issue or let me know the other solution to flatten the JSON array data would be really appreciate.

 

Data:

[{"ts":1403781896,"id":14,"log":"show"},{"ts":1403781896,"id":14,"log":"start"}]
[{"ts":1403781911,"id":14,"log":"press"},{"ts":1403781911,"id":14,"log":"press"}]

Logic Used:

SELECT 
get_json_object(single_json_table.single_json, '$.ts') AS ts,get_json_object(single_json_table.single_json, '$.id') AS id,get_json_object(single_json_table.single_json, '$.log') AS log
FROM(SELECT explode(json_array_col) as single_json FROM jt) single_json_table;

Error:

Error while compiling statement: FAILED: SemanticException [Error 10014]: line 2:0 Wrong arguments ''$.td'': No matching method for class org.apache.hadoop.hive.ql.udf.UDFJson with (map<string,string>, string). Possible choices: FUNC(string, string)

 

PS: I can unfold the array with the explode function but can't able to flatten the JSON array data into columns.

 

Thank You.

 

 

Don't have an account?
Coming from Hortonworks? Activate your account here