
How to parse streaming JSON objects

Hi All,

I have an application that streams data continuously using Kafka and Spark Streaming.

The data is in JSON format, though it is not complex JSON. Can someone suggest how to parse this streaming JSON into CSV?

Thanks in advance

Regards,

Vijay


Expert Contributor

@Vijay Kumar J,

Since you didn't mention which language you are going to use, I'll give an example in Python.

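import json
# KafkaUtils lives here in Spark 1.x/2.x; this module was removed in Spark 3.0
from pyspark.streaming.kafka import KafkaUtils

def load_json(txt):
    """Parse one message; return {} for anything that is not a JSON object."""
    try:
        obj = json.loads(txt)
    except (ValueError, TypeError):
        # Message in topic is not valid JSON
        return {}
    # Valid JSON that is not an object (a list, a number, ...) is dropped too
    return obj if isinstance(obj, dict) else {}

kvs = KafkaUtils.createDirectStream(...........)
# Each record is a (key, value) pair, so x[1] is the message value; filter out non-JSON messages
jsons_dstream = kvs.map(lambda x: load_json(x[1])).filter(lambda x: len(x) > 0)
# Build one CSV line per message from the keys k1, k2, k3
csv_dstream = jsons_dstream.map(lambda j_msg: '%s,%s,%s' % (j_msg.get('k1'), j_msg.get('k2'), j_msg.get('k3')))
csv_dstream.pprint()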
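
One caveat with the '%s,%s,%s' formatting: a value that itself contains a comma or a quote will break the CSV, and a missing key is printed as the text None. Below is a minimal sketch of a per-message converter that uses Python's csv module for quoting. The function name json_to_csv_line and the FIELDS tuple are only illustrative, and k1/k2/k3 are the same placeholder keys as above, so replace them with your real field names.

```python
import csv
import io
import json

# Placeholder field names (same k1/k2/k3 as in the example above)
FIELDS = ('k1', 'k2', 'k3')

def json_to_csv_line(txt, fields=FIELDS):
    """Return one CSV line for a JSON object string, or None if txt is not a JSON object."""
    try:
        msg = json.loads(txt)
    except (ValueError, TypeError):
        return None  # not valid JSON (or not a string at all)
    if not isinstance(msg, dict):
        return None  # valid JSON, but not an object
    buf = io.StringIO()
    # csv.writer quotes fields containing commas or quotes and writes None as an empty field
    csv.writer(buf).writerow([msg.get(f) for f in fields])
    return buf.getvalue().rstrip('\r\n')  # drop the row terminator added by the writer
```

In the stream this would probably be used as kvs.map(lambda x: json_to_csv_line(x[1])).filter(lambda line: line is not None), which replaces both the load_json step and the string formatting step.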

New Contributor

Hi @Ed Berezitsky

I have the same use case. I have to read a Kinesis stream using Spark Streaming. The input data is in JSON format. Can you please show me how to

1. parse JSON data and

2. display/write to HBase

I am using Scala-based Spark Streaming.

Thanks for your help.