Array of JSON records to Avro with Nifi

Explorer

I have a JSON file containing an array of records, and I'm trying to convert the entire file to an Avro file using the ConvertJSONToAvro processor.

JSON sample:

[
  {
    "id": 123,
    "title": "foo"
  },
  {
    "id": 345,
    "title": "bar"
  }
]

Avro Schema:

{
  "name": "test",
  "type": "array",
  "items": {
    "type": "record",
    "name": "user",
    "fields": [
      {
        "name": "id",
        "type": "int"
      },
      {
        "name": "title",
        "type": "string"
      }
    ]
  }
}

However, with the above Avro schema the processor throws an exception:

java.lang.IllegalArgumentException: Schemas for JSON files should be record
	at org.kitesdk.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) ~[kite-data-core-1.0.0.jar:na]
	at org.kitesdk.data.spi.filesystem.JSONFileReader.initialize(JSONFileReader.java:84) ~[kite-data-core-1.0.0.jar:na]
	at org.apache.nifi.processors.kite.ConvertJSONToAvro$1.process(ConvertJSONToAvro.java:144) ~[nifi-kite-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2578) ~[nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.processors.kite.ConvertJSONToAvro.onTrigger(ConvertJSONToAvro.java:139) ~[nifi-kite-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]


Does this mean the ConvertJSONToAvro processor cannot convert an array of records, and that I have to split the JSON file before feeding the records to this processor? It doesn't seem to accept "type": "array" at the root of the schema.
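The exception message says the schema root must be a record. Below is a minimal Python sketch, assuming the array schema from the question, of pulling the inner record schema out of the array wrapper. `record_schema_from` is a hypothetical helper written for illustration; it is not part of NiFi or Kite.

```python
import json

# The array-rooted schema from the question.
ARRAY_SCHEMA = """
{
  "name": "test",
  "type": "array",
  "items": {
    "type": "record",
    "name": "user",
    "fields": [
      {"name": "id", "type": "int"},
      {"name": "title", "type": "string"}
    ]
  }
}
"""


def record_schema_from(schema: dict) -> dict:
    """Return a record-rooted schema.

    A record schema is returned unchanged; for an array of records
    the "items" schema is returned. Anything else is rejected.
    """
    if schema.get("type") == "record":
        return schema
    items = schema.get("items")
    if (
        schema.get("type") == "array"
        and isinstance(items, dict)
        and items.get("type") == "record"
    ):
        return items
    raise ValueError("schema root is neither a record nor an array of records")


record_schema = record_schema_from(json.loads(ARRAY_SCHEMA))
print(json.dumps(record_schema, indent=2))
```

The result is the "user" record schema on its own, which is the shape the error message asks for. Whether the processor then accepts an array as input is a separate question.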

5 REPLIES

Master Guru

I think if you change the schema type to record it will work. It should take each entry in the JSON array and write it as a record in the Avro data file.

Explorer

I tried that, but it doesn't process the flowfile. It also gives the following warning:

ConvertJSONToAvro[id=8aec4759-eae0-1dab-ffff-ffff9b831c59] Failed to convert 1/1 records from JSON to Avro
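The "1/1 records" count is consistent with the whole array being read as a single top-level JSON value, although nothing in this thread confirms how the processor parses its input. The Python sketch below only illustrates that counting difference using the sample data from the question; `top_level_values` is a hypothetical helper, not NiFi or Kite code.

```python
import json


def top_level_values(text):
    """Parse a string holding one or more concatenated JSON values."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


# The sample input as posted: one array wrapping two objects.
as_array = '[{"id": 123, "title": "foo"}, {"id": 345, "title": "bar"}]'
# The same data as two separate top-level objects.
as_stream = '{"id": 123, "title": "foo"}\n{"id": 345, "title": "bar"}'

array_values = top_level_values(as_array)
stream_values = top_level_values(as_stream)
print(len(array_values), len(stream_values))
```

Read this way, the array form is one value (a list) that cannot match a record schema, while the second form is two values that each can.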

Master Guru

Hi @Alireza Sadeghi,
I tried the same input array that you posted above, and I was able to convert the JSON to Avro.

Steps:

  1. ConvertJSONToAvro expects one record at a time, so you need to use the SplitJson processor before feeding records to the ConvertJSONToAvro processor.
  2. In SplitJson, set the JsonPath Expression property to $.* so that it splits your incoming JSON array into individual records.
  3. Then set the Record Schema property of the ConvertJSONToAvro processor to:
{
  "type": "record",
  "name": "abc",
  "fields": [
    {
      "name": "id",
      "type": ["null", "int"]
    },
    {
      "name": "title",
      "type": ["null", "string"]
    }
  ]
}
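The split in step 2 can be sketched outside NiFi. This is a minimal Python illustration, assuming the sample input from the question, of what splitting a top-level array with $.* yields: one JSON document per array element. `split_top_level` is a hypothetical name, and the code mimics the effect only, not NiFi's implementation.

```python
import json
from typing import List

# Sample input from the question.
SAMPLE = '[{"id": 123, "title": "foo"}, {"id": 345, "title": "bar"}]'


def split_top_level(text: str) -> List[str]:
    """Return each element of a top-level JSON array as its own JSON string."""
    doc = json.loads(text)
    if not isinstance(doc, list):
        raise ValueError("expected a top-level JSON array")
    return [json.dumps(element) for element in doc]


records = split_top_level(SAMPLE)
for record in records:
    # Each string stands in for one flowfile handed to the next processor.
    print(record)
```

Each resulting document is a single object with "id" and "title", which matches the record schema in step 3.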

Screenshot of the flow: jsontoavro.png

SplitJson config: splitjson.png

ConvertJSONToAvro config: convertjsontoavro.png


Super Collaborator

Hi @Alireza Sadeghi,

Were you able to solve the issue? If so, how?

I am running into the same problem. The flow works if I chain the SplitJson and ConvertJSONToAvro processors, but I hit the same issue as you when I use ConvertJSONToAvro directly with the Record Schema property set.

Master Guru

@Saikrishna Tarapareddy,
If you are using NiFi 1.2+, you can try the ConvertRecord processor instead of ConvertJSONToAvro, since ConvertRecord takes an array of JSON records and converts it to Avro.

In the ConvertRecord processor, set the Record Reader to a JSON reader (for example JsonTreeReader) and the Record Writer to an Avro writer (AvroRecordSetWriter).