How to create avro schema for nested json

New Contributor

I want to define an Avro schema for nested JSON that I am reading from Kafka.

Example JSON:

{
  "payload": {
    "before": null,
    "after": {
      "id": 1
    },
    "source": {
      "version": "0.8.1.Final",
      "name": "dev",
      "server_id": 0,
      "ts_sec": 0,
      "gtid": null,
      "file": "mysql-bin-changelog.022566",
      "pos": 154,
      "row": 0,
      "snapshot": true,
      "thread": null,
      "db": "dev_datalake_poc",
      "table": "dummy",
      "query": null
    },
    "op": "c",
    "ts_ms": 1534239705758
  }
}

Suppose I want to fetch

payload.after.id and payload.op

I tried the schema below, but it results in null values ("payload": null):

{
  "type" : "record",
  "name" : "dummyschema",
  "fields" : [ {
    "name" : "payload",
    "type" : {
      "type" : "record",
      "name" : "payload",
      "fields" : [ {
        "name" : "before",
        "type" : [ "string", "null" ]
      }, {
        "name" : "after",
        "type" : {
          "type" : "record",
          "name" : "after",
          "fields" : [ {
            "name" : "id",
            "type" : "long"
          } ]
        }
      }, {
        "name" : "source",
        "type" : {
          "type" : "record",
          "name" : "source",
          "fields" : [ {
            "name" : "version",
            "type" : "string"
          }, {
            "name" : "name",
            "type" : "string"
          }, {
            "name" : "server_id",
            "type" : "long"
          }, {
            "name" : "ts_sec",
            "type" : "long"
          }, {
            "name" : "gtid",
            "type" : [ "string", "null" ]
          }, {
            "name" : "file",
            "type" : "string"
          }, {
            "name" : "pos",
            "type" : "long"
          }, {
            "name" : "row",
            "type" : "long"
          }, {
            "name" : "snapshot",
            "type" : "boolean"
          }, {
            "name" : "thread",
            "type" : [ "string", "null" ]
          }, {
            "name" : "db",
            "type" : "string"
          }, {
            "name" : "table",
            "type" : "string"
          }, {
            "name" : "query",
            "type" : [ "string", "null" ]
          } ]
        }
      }, {
        "name" : "op",
        "type" : "string"
      }, {
        "name" : "ts_ms",
        "type" : "long"
      } ]
    }
  } ]
}
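
One way to narrow down where the null comes from is to check the sample against the schema outside NiFi. The following is a minimal sketch in plain Python (standard library only), not an Avro or NiFi API: `mismatches` is a hypothetical helper that only understands records, arrays, unions and a few primitives, the schema is a trimmed copy of the one above (the `source` record is left out), and `delete_like` is an invented event used only to show what a failure looks like.

```python
import json

# Hypothetical helper, not part of Avro or NiFi: a tiny structural check
# of a parsed JSON value against an Avro schema given as a Python dict.
def _is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)

PRIMITIVES = {
    "null": lambda v: v is None,
    "boolean": lambda v: isinstance(v, bool),
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "long": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "float": _is_number,
    "double": _is_number,
    "string": lambda v: isinstance(v, str),
}

def mismatches(schema, value, path="$"):
    """Return a list of human-readable mismatches (empty if none found)."""
    if isinstance(schema, list):  # union: any one branch may match
        if any(not mismatches(branch, value, path) for branch in schema):
            return []
        return [f"{path}: value matches no branch of union {schema}"]
    if isinstance(schema, str):  # primitive type name
        check = PRIMITIVES.get(schema)
        if check is None:
            return [f"{path}: type {schema!r} is not handled by this sketch"]
        if check(value):
            return []
        return [f"{path}: expected {schema}, got {type(value).__name__}"]
    kind = schema["type"]
    if kind == "record":
        if not isinstance(value, dict):
            return [f"{path}: expected record, got {type(value).__name__}"]
        problems = []
        for field in schema["fields"]:  # extra JSON keys are ignored here
            name = field["name"]
            if name not in value:
                problems.append(f"{path}.{name}: field missing")
            else:
                problems += mismatches(field["type"], value[name], f"{path}.{name}")
        return problems
    if kind == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array, got {type(value).__name__}"]
        problems = []
        for i, item in enumerate(value):
            problems += mismatches(schema["items"], item, f"{path}[{i}]")
        return problems
    return mismatches(kind, value, path)  # e.g. {"type": "string"}

# Trimmed copy of the schema from the post (the "source" record is omitted).
SCHEMA = {
    "type": "record", "name": "dummyschema",
    "fields": [{"name": "payload", "type": {
        "type": "record", "name": "payload",
        "fields": [
            {"name": "before", "type": ["string", "null"]},
            {"name": "after", "type": {
                "type": "record", "name": "after",
                "fields": [{"name": "id", "type": "long"}]}},
            {"name": "op", "type": "string"},
            {"name": "ts_ms", "type": "long"},
        ]}}],
}

event = json.loads(
    '{"payload": {"before": null, "after": {"id": 1},'
    ' "op": "c", "ts_ms": 1534239705758}}'
)
# Invented event, for illustration only: "after" is null here.
delete_like = json.loads(
    '{"payload": {"before": null, "after": null, "op": "d", "ts_ms": 0}}'
)

print(mismatches(SCHEMA, event))
print(mismatches(SCHEMA, delete_like))
```

In this sketch the sample event from the post produces no mismatches, while an event whose "after" is null does, because "after" is declared as a plain record rather than a union with "null". That is only a hint about where to look, not a diagnosis of the NiFi behaviour.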


@Dhruvil Dhankani

I think you are looking for the EvaluateJsonPath processor, where you can define payload.after.id and payload.op as follows when the flowfile content is the JSON from Kafka:

payload.after.id: $.payload.after.id
payload.op: $.payload.op
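
To make clear what those two expressions select, here is a small sketch in plain Python. It is not how NiFi evaluates JSONPath: `lookup` is a hypothetical stand-in that only handles simple `$.a.b.c` paths, and the message is a shortened copy of the sample from the question.

```python
import json

def lookup(document, json_path):
    """Resolve a simple '$.a.b.c' path; an illustrative stand-in only."""
    if not json_path.startswith("$."):
        raise ValueError("this sketch only handles paths of the form $.a.b")
    current = document
    for key in json_path[2:].split("."):
        current = current[key]  # raises KeyError if the key is absent
    return current

# Shortened copy of the sample message from the question.
message = json.loads(
    '{"payload": {"before": null, "after": {"id": 1}, "op": "c"}}'
)

# Property name -> JSONPath expression, as in the reply above.
properties = {
    "payload.after.id": "$.payload.after.id",
    "payload.op": "$.payload.op",
}

attributes = {name: lookup(message, path) for name, path in properties.items()}
print(attributes)  # {'payload.after.id': 1, 'payload.op': 'c'}
```

NiFi flowfile attributes are strings, so the dictionary above only approximates what the processor would produce.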

Applying an Avro schema to the JSON and using record readers is another beast, so let us know if that is what you are looking for.

If this answer is helpful, please choose accept to mark it as answered.

New Contributor

@Steven Matison

I need help creating an Avro schema for nested JSON:

My example JSON in GenerateFlowFile:

{
  "ser": {
    "measureSeries": {
      "teNo": "TE",
      "testStart": "EE",
      "testEnd": "EE"
    }
  }
}

My nested Avro schema, validated at https://json-schema-validator.herokuapp.com/avro.jsp:

{
  "name": "devices",
  "type": "record",
  "fields": [
    {
      "name": "ser",
      "type": {
        "type": "array",
        "items": {
          "name": "meSer",
          "type": "record",
          "fields": [
            {"name": "teNo", "type": ["string", "null"]},
            {"name": "testStart", "type": ["string", "null"]},
            {"name": "testEnd", "type": ["string", "null"]}
          ]
        }
      }
    }
  ]
}

I am using the ConvertRecord processor:

Record Reader: JsonTreeReader

Record Writer: RecordSetWriter

It is throwing the error "could not parse incoming data". Any input on solving the nested schema will help me.
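
One visible difference between the two snippets in this post: the schema declares "ser" as an array of records, while the sample JSON has "ser" as an object that contains a nested "measureSeries" object. Whether that is what triggers the parse error is a guess. The sketch below, in plain Python with only the standard library, builds a schema shaped like the sample and compares the nesting of both. The record name `serRecord` and the helpers `schema_shape` and `json_shape` are made up for illustration.

```python
import json

# Sample JSON from the post.
sample = json.loads(
    '{"ser": {"measureSeries":'
    ' {"teNo": "TE", "testStart": "EE", "testEnd": "EE"}}}'
)

def nullable_string_field(name):
    # Same union as in the posted schema.
    return {"name": name, "type": ["string", "null"]}

# A schema shaped like the sample: record -> record -> record.
# "serRecord" is a hypothetical name; the other names come from the post.
nested_schema = {
    "name": "devices",
    "type": "record",
    "fields": [{
        "name": "ser",
        "type": {
            "name": "serRecord",
            "type": "record",
            "fields": [{
                "name": "measureSeries",
                "type": {
                    "name": "meSer",
                    "type": "record",
                    "fields": [
                        nullable_string_field("teNo"),
                        nullable_string_field("testStart"),
                        nullable_string_field("testEnd"),
                    ],
                },
            }],
        },
    }],
}

# The schema as posted, condensed: "ser" is an array of "meSer" records.
posted_schema = {
    "name": "devices",
    "type": "record",
    "fields": [{
        "name": "ser",
        "type": {"type": "array", "items": {
            "name": "meSer", "type": "record",
            "fields": [
                nullable_string_field("teNo"),
                nullable_string_field("testStart"),
                nullable_string_field("testEnd"),
            ],
        }},
    }],
}

def schema_shape(avro_type):
    """Nested field names of a record schema; anything else becomes None."""
    if isinstance(avro_type, dict) and avro_type.get("type") == "record":
        return {f["name"]: schema_shape(f["type"]) for f in avro_type["fields"]}
    return None

def json_shape(value):
    """Nested key names of a JSON object; anything else becomes None."""
    if isinstance(value, dict):
        return {k: json_shape(v) for k, v in value.items()}
    return None

print(schema_shape(nested_schema) == json_shape(sample))  # True
print(schema_shape(posted_schema) == json_shape(sample))  # False

# Schema text that could be pasted into a schema property.
print(json.dumps(nested_schema, indent=2))
```

The comparison only looks at nesting and field names, not at value types, so it is a quick sanity check rather than Avro validation.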

Besides making the data files easy to work with, a schema also means you can be certain that a file which conforms to it, fed to a parser that validates files against that schema, will never error out in unpredictable ways. If the file does not conform, you reject it right away, rather than partially using the file's contents and then erroring out halfway through, which can be much more dangerous.

New Contributor

How did you create the Avro schema from the nested JSON?
