
How to create an Avro schema for nested JSON

New Contributor

I want to add an Avro schema for nested JSON that I am trying to read from Kafka.

Example JSON:

{
  "payload":{
    "before":null,
    "after":{
      "id":1
    },
    "source":{
      "version":"0.8.1.Final",
      "name":"dev",
      "server_id":0,
      "ts_sec":0,
      "gtid":null,
      "file":"mysql-bin-changelog.022566",
      "pos":154,
      "row":0,
      "snapshot":true,
      "thread":null,
      "db":"dev_datalake_poc",
      "table":"dummy",
      "query":null
    },
    "op":"c",
    "ts_ms":1534239705758
  }
}

Suppose I want to fetch

payload.after.id and payload.op

I have tried the schema below, but it results in null values ("payload": null):

{
  "type" : "record",
  "name" : "dummyschema",
  "fields" : [ {
    "name" : "payload",
    "type" : {
      "type" : "record",
      "name" : "payload",
      "fields" : [ {
        "name" : "before",
        "type" : [ "string", "null" ]
      }, {
        "name" : "after",
        "type" : {
          "type" : "record",
          "name" : "after",
          "fields" : [ {
            "name" : "id",
            "type" : "long"
          } ]
        }
      }, {
        "name" : "source",
        "type" : {
          "type" : "record",
          "name" : "source",
          "fields" : [ {
            "name" : "version",
            "type" : "string"
          }, {
            "name" : "name",
            "type" : "string"
          }, {
            "name" : "server_id",
            "type" : "long"
          }, {
            "name" : "ts_sec",
            "type" : "long"
          }, {
            "name" : "gtid",
            "type" : [ "string", "null" ]
          }, {
            "name" : "file",
            "type" : "string"
          }, {
            "name" : "pos",
            "type" : "long"
          }, {
            "name" : "row",
            "type" : "long"
          }, {
            "name" : "snapshot",
            "type" : "boolean"
          }, {
            "name" : "thread",
            "type" : [ "string", "null" ]
          }, {
            "name" : "db",
            "type" : "string"
          }, {
            "name" : "table",
            "type" : "string"
          }, {
            "name" : "query",
            "type" : [ "string", "null" ]
          } ]
        }
      }, {
        "name" : "op",
        "type" : "string"
      }, {
        "name" : "ts_ms",
        "type" : "long"
      } ]
    }
  } ]
}
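
For reference, a minimal local check of this schema against the sample JSON, assuming the fastavro Python library is available and using hypothetical file names, might look like this (it only confirms whether the schema matches the data shape, independent of NiFi record readers):

# Minimal sketch: validate the sample JSON against the Avro schema above
# using fastavro (pip install fastavro). File names are hypothetical.
import json
from fastavro import parse_schema
from fastavro.validation import validate

with open("dummyschema.avsc") as f:   # the schema shown above
    schema = parse_schema(json.load(f))
with open("sample.json") as f:        # the example JSON shown above
    record = json.load(f)

# raise_errors=False returns True/False instead of raising an exception,
# so a mismatch between schema and data shows up as a simple False.
print(validate(record, schema, raise_errors=False))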

4 Replies

Super Guru
@Dhruvil Dhankani

I think you are looking for the EvaluateJsonPath processor, where you can define payload.after.id and payload.op as follows when the JSON content file comes from Kafka:

payload.after.id: $.payload.after.id
payload.op: $.payload.op
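
For a quick local check of those two expressions outside NiFi, a minimal sketch with the jsonpath-ng Python library (an assumption here, not part of NiFi) could look like:

# Minimal sketch: evaluate the two JsonPath expressions against the sample
# payload using jsonpath-ng (pip install jsonpath-ng).
from jsonpath_ng import parse

sample = {
    "payload": {
        "before": None,
        "after": {"id": 1},
        "op": "c",
        "ts_ms": 1534239705758,
    }
}

for attr, expr in [("payload.after.id", "$.payload.after.id"),
                   ("payload.op", "$.payload.op")]:
    values = [m.value for m in parse(expr).find(sample)]
    print(attr, "->", values)   # expected: [1] and ['c']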

Applying an Avro schema to the JSON and using record readers is another beast, so let us know if that is what you are looking for.

If this answer is helpful, please choose accept to mark it as answered.

New Contributor

@Steven Matison

I need help creating an Avro schema for nested JSON.

My example JSON in GenerateFlowFile:

{ "ser" :

{ "measureSeries" :

{ "teNo" : "TE",

"testStart" : "EE",

"testEnd" : "EE" }

}

}

My nested Avro schema, validated at https://json-schema-validator.herokuapp.com/avro.jsp:

{ "name":"devices", "type":"record", "fields":

[ { "name": "ser", "type":

{"type":"array","items":{ "name":"meSer","type":"record","fields":

[ {"name":"teNo","type":["string","null"]},

{"name":"testStart","type":["string","null"]},

{"name":"testEnd","type":["string","null" ]} ]

} }

} ]

}

I am using the ConvertRecord processor:

Record Reader: JsonTreeReader

Record Writer: RecordSetWriter

It is throwing the error "could not parse incoming data". Any input on solving the nested schema would help.
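
For what it's worth, a minimal local check with the Python fastavro library (an assumption, not part of NiFi) can show whether the example JSON conforms to the schema above; note that the schema types "ser" as an array of records, while the sample JSON nests a single "measureSeries" object under "ser", and a check like this would surface that mismatch:

# Minimal sketch: check the sample JSON against the Avro schema above with
# fastavro (pip install fastavro); schema and sample are repeated inline.
from fastavro import parse_schema
from fastavro.validation import validate

schema = parse_schema({
    "name": "devices", "type": "record",
    "fields": [{
        "name": "ser",
        "type": {
            "type": "array",
            "items": {
                "name": "meSer", "type": "record",
                "fields": [
                    {"name": "teNo", "type": ["string", "null"]},
                    {"name": "testStart", "type": ["string", "null"]},
                    {"name": "testEnd", "type": ["string", "null"]},
                ],
            },
        },
    }],
})

sample = {"ser": {"measureSeries": {"teNo": "TE",
                                    "testStart": "EE",
                                    "testEnd": "EE"}}}

# Should print False: "ser" holds a nested object here, but the schema
# declares it as an array of "meSer" records (with no "measureSeries" level).
print(validate(sample, schema, raise_errors=False))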


Besides making it easy to work with the data files, it also means you can be certain that a file which conforms to a schema, fed to a parser that knows how to validate files against that schema, will never error out in unpredictable ways. If the file does not conform, you reject it right away, rather than partially consuming the file's contents and then erroring out halfway through, which can be much more dangerous.

New Contributor

How did you create the Avro schema from the nested JSON?