Array of JSON records to Avro with Nifi

New Contributor

I have a JSON file containing an array of records, and I'm trying to convert the entire file to an Avro file using the ConvertJSONToAvro processor.

JSON sample:

[
  {
    "id": 123,
    "title": "foo"
  },
  {
    "id": 345,
    "title": "bar"
  }
]

Avro Schema:

{
  "name": "test",
  "type": "array",
  "items": {
    "type": "record",
    "name": "user",
    "fields": [
      {
        "name": "id",
        "type": "int"
      },
      {
        "name": "title",
        "type": "string"
      }
    ]
  }
}

However, with the above Avro schema the processor throws an exception:

java.lang.IllegalArgumentException: Schemas for JSON files should be record
	at org.kitesdk.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) ~[kite-data-core-1.0.0.jar:na]
	at org.kitesdk.data.spi.filesystem.JSONFileReader.initialize(JSONFileReader.java:84) ~[kite-data-core-1.0.0.jar:na]
	at org.apache.nifi.processors.kite.ConvertJSONToAvro$1.process(ConvertJSONToAvro.java:144) ~[nifi-kite-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2578) ~[nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.processors.kite.ConvertJSONToAvro.onTrigger(ConvertJSONToAvro.java:139) ~[nifi-kite-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
	at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]


Does this mean the ConvertJSONToAvro processor cannot convert an array of records, and that I have to split the JSON file before feeding the records to this processor? It doesn't seem to recognise "type": "array" at the root of the schema.

Re: Array of JSON records to Avro with Nifi

I think that if you change the schema type to record it will work. It should take each entry in the JSON array and write it as a record in the Avro data file.
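
One way to read this suggestion is to describe a single element of the array rather than the array itself. Here is a minimal Python sketch of that idea, using the schema from the question. `record_schema_from` is a hypothetical helper written for illustration; it is not part of NiFi or Kite.

```python
import json

# Array-rooted schema copied from the original question.
ARRAY_SCHEMA = json.loads("""
{
  "name": "test",
  "type": "array",
  "items": {
    "type": "record",
    "name": "user",
    "fields": [
      {"name": "id", "type": "int"},
      {"name": "title", "type": "string"}
    ]
  }
}
""")


def record_schema_from(schema):
    """Return a record-rooted schema (hypothetical helper, not a NiFi API)."""
    if schema.get("type") == "record":
        return schema
    if schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        items = schema["items"]
        if items.get("type") == "record":
            # The element type of the array is the schema for one record.
            return items
    raise ValueError("schema does not describe a record or an array of records")


record_schema = record_schema_from(ARRAY_SCHEMA)
print(json.dumps(record_schema, indent=2))
```

The printed schema has "type": "record" at the root, which is the form the exception message asks for. As the follow-up below reports, a record schema on its own did not resolve the problem for the original poster.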

Re: Array of JSON records to Avro with Nifi

New Contributor

I tried that, but it doesn't process the flowfile. It also gives the following warning:

ConvertJSONToAvro[id=8aec4759-eae0-1dab-ffff-ffff9b831c59] Failed to convert 1/1 records from JSON to Avro

Re: Array of JSON records to Avro with Nifi

Super Guru

Hi @Alireza Sadeghi,
I tried the same input array that you posted above and was able to convert the JSON to Avro.

Steps:

  1. ConvertJSONToAvro expects one record at a time, so you need to use the SplitJson processor before feeding records to the ConvertJSONToAvro processor.
  2. In SplitJson, set the JsonPath Expression property to $.* so that it splits the incoming JSON array into individual records.
  3. Then set the Record Schema property in the ConvertJSONToAvro processor to:
{
	"type": "record",
	"name": "abc",
	"fields": [{
		"name": "id",
		"type": ["null","int"]
	},
	{
		"name": "title",
		"type": ["null","string"]
	}]
}
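
The steps above can be approximated outside NiFi to see what each record looks like after the split. This is only a rough Python sketch: `split_array`, `value_matches` and `conforms` are hypothetical names, the type check covers just the three Avro types used in this thread, and treating a missing field as null is a simplifying assumption.

```python
import json

# Sample input from the question, as a single JSON array.
SAMPLE = '[{"id": 123, "title": "foo"}, {"id": 345, "title": "bar"}]'

# Record schema from step 3 (nullable unions for both fields).
RECORD_SCHEMA = {
    "type": "record",
    "name": "abc",
    "fields": [
        {"name": "id", "type": ["null", "int"]},
        {"name": "title", "type": ["null", "string"]},
    ],
}


def split_array(text):
    """Emulate a `$.*` split: one JSON document per array element."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array at the root")
    return [json.dumps(element) for element in data]


def value_matches(value, avro_type):
    """Check a Python value against one of the Avro types used here."""
    if avro_type == "null":
        return value is None
    if avro_type == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if avro_type == "string":
        return isinstance(value, str)
    return False  # other Avro types are out of scope for this sketch


def conforms(record, schema):
    """Return True if every field matches at least one branch of its type."""
    for field in schema["fields"]:
        types = field["type"]
        if not isinstance(types, list):
            types = [types]
        # Assumption: a missing field is treated like null.
        value = record.get(field["name"])
        if not any(value_matches(value, t) for t in types):
            return False
    return True


pieces = split_array(SAMPLE)
results = [conforms(json.loads(p), RECORD_SCHEMA) for p in pieces]
print(pieces)
print(results)
```

Each element of `pieces` corresponds to one flowfile leaving the split step, which is the single-record shape the converter is given in this flow.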

Screenshot of the flow: jsontoavro.png
SplitJson config: splitjson.png
ConvertJSONToAvro config: convertjsontoavro.png


Re: Array of JSON records to Avro with Nifi

Super Collaborator

Hi @Alireza Sadeghi,

Were you able to solve the issue? If so, how?

I am running into the same issue. It works if I use the SplitJson -> ConvertJSONToAvro processors,

but I run into the same issue as you when I use ConvertJSONToAvro directly with the Record Schema property set.

Re: Array of JSON records to Avro with Nifi

Super Guru

@Saikrishna Tarapareddy,
If you are using NiFi 1.2+, you can try the ConvertRecord processor instead of ConvertJSONToAvro, since ConvertRecord accepts an array of JSON records and converts it to Avro.

Configure ConvertRecord with a JSON reader (for example JsonTreeReader) as the Record Reader and AvroRecordSetWriter as the Record Writer.
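
The difference from the split-based flow is that a record reader consumes the whole array within one flowfile. Here is a rough Python sketch of that idea. It assumes, as I understand the JSON record reader's behaviour, that a top-level array is read as one record per element and a top-level object as a single record. `read_records` is a hypothetical name and this is not NiFi code.

```python
import json


def read_records(text):
    """Yield each record from a JSON document (an object or an array of objects)."""
    data = json.loads(text)
    # A single object is handled as a one-element sequence.
    elements = data if isinstance(data, list) else [data]
    for element in elements:
        if not isinstance(element, dict):
            raise ValueError("each record must be a JSON object")
        yield element


array_input = '[{"id": 123, "title": "foo"}, {"id": 345, "title": "bar"}]'
single_input = '{"id": 123, "title": "foo"}'

array_records = list(read_records(array_input))
single_records = list(read_records(single_input))
print(len(array_records), len(single_records))
```

With this approach the schema still describes one record, not the array, and no separate split step is needed before the conversion.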
