NiFi - Split a record using a non-root JSON attribute

New Contributor

I have JSON input of the following format:

{
  "Id": 1000000,
  "ReportName": "TestReport",
  "Results": [{
    "Id": 1,
    "ResultId": "1000000-0",
    "Query": {
      "Id": "001",
      "Name": "TestQuery0"
    }
  }, {
    "Id": 2,
    "ResultId": "1000000-1",
    "Query": {
      "Id": "002",
      "Name": "TestQuery1"
    }
  }]
}

These files can become quite large depending on the number of Results in the Report, and I was hoping to convert the single FlowFile into multiple records for processing. However, because of the JSON format, SplitRecord produces only one record per split: there is one report per FlowFile and therefore only one root-level element.
I am looking for a method or strategy to split the FlowFile into smaller records while still keeping the report cohesive when it is finally put in HDFS.

Current Strategy:

  1. Use JoltTransformJSON to inject the report information into each result
  2. Use SplitRecord to split the FlowFile on each result
  3. Process the records
  4. Use MergeRecord to reassemble the modified FlowFile from step 1
  5. Convert back to the original FlowFile format (not sure of the best method here)
  6. Use MergeContent and push to HDFS
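
To make the intended round trip concrete, here is a minimal Python sketch of steps 1, 2, 4 and 5 outside NiFi. It is only an illustration of the data shapes, not NiFi code: the function names `split_report` and `merge_report` and the injected field names `ReportId` and `ReportName` are hypothetical, chosen so that the report-level `Id` does not collide with each result's own `Id`.

```python
import copy

def split_report(report):
    """Steps 1-2: copy report-level fields into each result, one record per result."""
    records = []
    for result in report["Results"]:
        record = copy.deepcopy(result)
        # Hypothetical field names; prefixed to avoid clashing with the result's "Id".
        record["ReportId"] = report["Id"]
        record["ReportName"] = report["ReportName"]
        records.append(record)
    return records

def merge_report(records):
    """Steps 4-5: regroup processed records into the original report shape."""
    if not records:
        raise ValueError("no records to merge")
    report = {
        "Id": records[0]["ReportId"],
        "ReportName": records[0]["ReportName"],
        "Results": [],
    }
    for record in records:
        # Strip the injected report fields so each result regains its original shape.
        result = {k: v for k, v in record.items()
                  if k not in ("ReportId", "ReportName")}
        report["Results"].append(result)
    return report

report = {
    "Id": 1000000,
    "ReportName": "TestReport",
    "Results": [
        {"Id": 1, "ResultId": "1000000-0",
         "Query": {"Id": "001", "Name": "TestQuery0"}},
        {"Id": 2, "ResultId": "1000000-1",
         "Query": {"Id": "002", "Name": "TestQuery1"}},
    ],
}

records = split_report(report)
rebuilt = merge_report(records)
```

The sketch assumes every record passed to `merge_report` belongs to the same report and arrives in the desired order; in a real flow the merge step would need to group records by report first.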

Any advice would be appreciated! Thank you

1 ACCEPTED SOLUTION

Master Guru

If I am reading your use case correctly, I think you're looking for what the ForkRecord processor does; it allows you to fork a (usually single) record into multiple records based on a Record Path (similar to JSONPath but different syntax and expressiveness), possibly keeping the "root" elements common to each outgoing record.
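
As a rough illustration of the forking idea, the Python sketch below mimics what forking a record on its `Results` array could look like, either keeping the parent structure around a single element ("split") or pulling each element out and optionally copying the root fields onto it ("extract"). This is not NiFi code and not the processor's implementation: `fork_record` and its parameters are hypothetical names, and the choice to let the element's own fields win when a name such as `Id` appears at both levels is an assumption that should be checked against the ForkRecord documentation.

```python
import copy

def fork_record(record, array_field, mode="extract", include_parent=True):
    """Fork one record into one record per element of record[array_field]."""
    elements = record.get(array_field) or []
    # Everything except the array itself counts as the "root" fields.
    parent = {k: v for k, v in record.items() if k != array_field}
    forked = []
    for element in elements:
        if mode == "split":
            # Keep the parent structure, with the array reduced to one element.
            out = copy.deepcopy(parent)
            out[array_field] = [copy.deepcopy(element)]
        elif mode == "extract":
            # Emit the element itself, optionally enriched with the root fields.
            out = copy.deepcopy(parent) if include_parent else {}
            # Assumption: the element's fields override same-named root fields.
            out.update(copy.deepcopy(element))
        else:
            raise ValueError(f"unknown mode: {mode}")
        forked.append(out)
    return forked

report = {
    "Id": 1000000,
    "ReportName": "TestReport",
    "Results": [
        {"Id": 1, "ResultId": "1000000-0",
         "Query": {"Id": "001", "Name": "TestQuery0"}},
        {"Id": 2, "ResultId": "1000000-1",
         "Query": {"Id": "002", "Name": "TestQuery1"}},
    ],
}

extracted = fork_record(report, "Results", mode="extract", include_parent=True)
split = fork_record(report, "Results", mode="split")
```

In this sketch the "split" form keeps the report `Id` intact and is easy to merge back into the original shape, while the "extract" form gives flatter records at the cost of the name collision on `Id`.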


New Contributor

Thank you for the response. This was the correct answer, but I was unable to verify until recently.