Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Spark SQL JSON array querry ?

avatar
Explorer

I have a simple JSON dataset as below. How do I query all parts.lock

JSON:

{
    "id": 1,
    "name": "A green door",
    "price": 12.50,
    "tags": ["home", "green"],
    "parts" : [
        {
            "lock" : "One lock",
            "key" : "single key"
        },
        {
            "lock" : "2 lock",
            "key" : "2 key"
        }
    ]
}

Query:

select id,name,price,parts.lock from product

The point is if I use parts[0].lock it will return one row as below:

{u'price': 12.5, u'id': 1, u'.lock': {u'lock': u'One lock', u'key': u'single key'}, u'name': u'A green door'}

But I want to return all the locks in the parts structure.

Please help me with this, thanks !

1 ACCEPTED SOLUTION

avatar
Contributor
3 REPLIES 3

avatar
Contributor

avatar
Explorer

Thanks for your reply chrisf,

 

I've been trying to use LATERAL VIEW explode for week but still can't figure how to use it, can you give me an example from my first post.

I also try json-serde in HiveContext, i can parse table, but can't querry although the querry work fine in Hive. 

EX:

 

+ In both Hive anh HiveContext, i can parse table: 

CREATE EXTERNAL TABLE data( parts array<struct<locks:STRING, key:STRING>> ) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' LOCATION '/user/hue/...';

 

+ Then in Hive, i can use this: 

SELECT parts.locks FROM data;

 

but it will return error in HiveContext.

 

Looking forward for reply, thanks !

avatar
Explorer

Oh, I finally do it, follow is my hql:

SELECT id, part.lock, part.key FROM mytable EXTERNAL VIEW explode(parts) parttable AS part;

 

many thanks chrisf !