Created on 08-16-2018 08:47 PM - edited 09-16-2022 06:35 AM
I want to write an Avro schema for a nested JSON message that I am reading from Kafka.
Example JSON:
{
  "payload": {
    "before": null,
    "after": {
      "id": 1
    },
    "source": {
      "version": "0.8.1.Final",
      "name": "dev",
      "server_id": 0,
      "ts_sec": 0,
      "gtid": null,
      "file": "mysql-bin-changelog.022566",
      "pos": 154,
      "row": 0,
      "snapshot": true,
      "thread": null,
      "db": "dev_datalake_poc",
      "table": "dummy",
      "query": null
    },
    "op": "c",
    "ts_ms": 1534239705758
  }
}
Suppose I want to fetch payload.after.id and payload.op.
I have tried the following schema, but it results in null values ("payload": null):
{
  "type": "record",
  "name": "dummyschema",
  "fields": [
    {
      "name": "payload",
      "type": {
        "type": "record",
        "name": "payload",
        "fields": [
          { "name": "before", "type": [ "string", "null" ] },
          {
            "name": "after",
            "type": {
              "type": "record",
              "name": "after",
              "fields": [
                { "name": "id", "type": "long" }
              ]
            }
          },
          {
            "name": "source",
            "type": {
              "type": "record",
              "name": "source",
              "fields": [
                { "name": "version", "type": "string" },
                { "name": "name", "type": "string" },
                { "name": "server_id", "type": "long" },
                { "name": "ts_sec", "type": "long" },
                { "name": "gtid", "type": [ "string", "null" ] },
                { "name": "file", "type": "string" },
                { "name": "pos", "type": "long" },
                { "name": "row", "type": "long" },
                { "name": "snapshot", "type": "boolean" },
                { "name": "thread", "type": [ "string", "null" ] },
                { "name": "db", "type": "string" },
                { "name": "table", "type": "string" },
                { "name": "query", "type": [ "string", "null" ] }
              ]
            }
          },
          { "name": "op", "type": "string" },
          { "name": "ts_ms", "type": "long" }
        ]
      }
    }
  ]
}
Created 08-17-2018 11:59 AM
I think you are looking for the EvaluateJsonPath processor, where you can define payload.after.id and payload.op as follows when the JSON content comes in from Kafka:
payload.after.id: $.payload.after.id
payload.op: $.payload.op
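For example, with the processor's Destination property set to flowfile-attribute and the sample JSON above as the flowfile content, you should end up with attributes roughly like this (the attribute names are simply whatever property names you add):
payload.after.id = 1
payload.op = c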
Applying an Avro schema to the JSON and using record readers is another beast, so let us know if that is what you are looking for.
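If you do go the record reader route, one thing worth trying is a trimmed-down schema that only declares the fields you actually need, with nullable unions written "null"-first and given defaults. This is only a sketch based on the sample JSON above, not a verified fix:
{
  "type": "record",
  "name": "dummyschema",
  "fields": [
    {
      "name": "payload",
      "type": {
        "type": "record",
        "name": "payload",
        "fields": [
          {
            "name": "after",
            "type": {
              "type": "record",
              "name": "after",
              "fields": [
                { "name": "id", "type": [ "null", "long" ], "default": null }
              ]
            }
          },
          { "name": "op", "type": [ "null", "string" ], "default": null }
        ]
      }
    }
  ]
}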
If this answer is helpful, please choose Accept to mark it as answered.
Created 11-26-2018 06:20 PM
@Steven Matison
I need help creating an Avro schema for nested JSON.
My example JSON in GenerateFlowFile:
{ "ser" :
{ "measureSeries" :
{ "teNo" : "TE",
"testStart" : "EE",
"testEnd" : "EE" }
}
}
My nested Avro schema, validated at https://json-schema-validator.herokuapp.com/avro.jsp:
{ "name":"devices", "type":"record", "fields":
[ { "name": "ser", "type":
{"type":"array","items":{ "name":"meSer","type":"record","fields":
[ {"name":"teNo","type":["string","null"]},
{"name":"testStart","type":["string","null"]},
{"name":"testEnd","type":["string","null" ]} ]
} }
} ]
}
I am using the ConvertRecord processor with:
Record Reader: JsonTreeReader
Record Writer: RecordSetWriter.
It throws the error "could not parse incoming data". Any input on solving the nested schema would help me.
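For comparison, the sample JSON above has ser as a plain object (not an array), so a schema that mirrors that nesting would use nested records rather than an array of records, and would include the measureSeries level. A purely illustrative sketch, with made-up record names:
{
  "name": "devices",
  "type": "record",
  "fields": [
    {
      "name": "ser",
      "type": {
        "type": "record",
        "name": "serRecord",
        "fields": [
          {
            "name": "measureSeries",
            "type": {
              "type": "record",
              "name": "measureSeriesRecord",
              "fields": [
                { "name": "teNo", "type": [ "null", "string" ], "default": null },
                { "name": "testStart", "type": [ "null", "string" ], "default": null },
                { "name": "testEnd", "type": [ "null", "string" ], "default": null }
              ]
            }
          }
        ]
      }
    }
  ]
}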
Created 12-03-2018 03:26 PM
Besides making it easier to work with the data files, it also means you can be certain that a file which conforms to a schema, fed to a parser that knows how to validate files against that schema, will never fail in unpredictable ways. If the file does not conform, you reject it right away, rather than partially consuming its contents and then erroring out halfway through, which can be much more dangerous.
Created 04-16-2019 01:44 PM
How did you create the Avro schema from the nested JSON?