
JSON Schema for dynamic key field in Spark Structured Streaming?


Contributor

I receive JSON data from Kafka and parse it with the from_json() method, which expects a schema from me. My JSON structure looks like this:

 

{
    "Items": {
        "key1": [
            { "id": "", "name": "", "val": "" }
        ],
        "key2": [
            { "id": "", "name": "", "val": "" }
        ],
        "key3": [
            { "id": "", "name": "", "val": "" }
        ]
    }
}

 

The keys key1, key2, and key3 are dynamic, so they may change from message to message. For example, another JSON message is:

{
    "Items": {
        "hortoworks": [
            { "id": "", "name": "", "val": "" }
        ],
        "community": [
            { "id": "", "name": "", "val": "" }
        ],
        "question": [
            { "id": "", "name": "", "val": "" }
        ]
    }
}


The key names are not known in advance, but the "id", "name", and "val" fields inside each key are always the same.

I must define a JSON schema to read this data from Kafka in Spark Structured Streaming. How can I do this?
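
One possible approach, offered as a sketch rather than a confirmed answer from this thread: since only the key names vary and the value shape is fixed, "Items" could be declared as a map from string to an array of structs instead of a struct with named fields. The PySpark snippet below assumes a Spark version where from_json() accepts a DDL-formatted schema string and map-typed fields, and that the Kafka payload sits in the usual `value` column. The names `ITEMS_SCHEMA_DDL` and `flatten_items` are made up for illustration.

```python
# Sketch only: the key names under "Items" are unknown, so "Items" is modelled
# as a map (string key -> array of structs) rather than a struct with fixed fields.
ITEMS_SCHEMA_DDL = (
    "Items MAP<STRING, ARRAY<STRUCT<id: STRING, name: STRING, val: STRING>>>"
)


def flatten_items(kafka_df):
    """Parse the Kafka `value` column and emit one row per (key, item) pair.

    `kafka_df` is assumed to be a DataFrame read from the Kafka source,
    i.e. one that has a binary `value` column holding the JSON text.
    """
    # Imported lazily so the schema constant can be inspected without Spark installed.
    from pyspark.sql import functions as F

    parsed = kafka_df.select(
        F.from_json(F.col("value").cast("string"), ITEMS_SCHEMA_DDL).alias("data")
    )
    # explode() on a map column yields one row per entry, with a key and a value column.
    per_key = parsed.select(F.explode("data.Items").alias("key", "items"))
    # A second explode() turns each array element into its own row.
    per_item = per_key.select("key", F.explode("items").alias("item"))
    return per_item.select("key", "item.id", "item.name", "item.val")
```

With this shape, the dynamic names (key1, hortoworks, community, and so on) end up as values in a `key` column instead of being part of the schema, so new names should not require a schema change. It does not help if the inner "id", "name", "val" structure itself changes.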


Re: JSON Schema for dynamic key field in Spark Structured Streaming?

New Contributor

Hi @sosyalmedya_ogu ,

 

Did you find a reliable workaround for this?


I have run into a similar use case where the JSON schema might change. Our Kafka producer application listens to an external API endpoint, so we do not have control over the schema. Therefore, I am looking for a solution to handle a dynamic JSON schema while processing this in Structured Streaming.

 

Any help would be highly appreciated.

 

Thanks,

Kumar Rohit

 
