Apache Nifi - How to calculate SUM or AVERAGE of values in a JSON array

Expert Contributor

Hi,

I have a JSON message 'store' that contains an array of 'books'. I want to calculate the sum or average of all book prices.

Is there a way to do this in NiFi? I explored JsonPath and JOLT, but so far I haven't found one. Thanks.

Input:

{ "store": {
    "books": [ 
      { "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      { "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      { "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      { "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ]
  }
}

Output:

Sum of all prices : 53.92

5 REPLIES

Hi @Obaid Salikeen,

My suggestion would be to use the ExecuteScript processor with your scripting language of choice (Groovy, Python, Ruby, Scala, JavaScript, and Lua are all supported). With Groovy, for example, this would take approximately 4 lines: use a JsonSlurper to parse the JSON and extract the value(s) you are interested in, then use any combination of collect, sum, average, etc. to perform the desired mathematical operation, and write the result to the OutputStream.
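To make the parse-then-aggregate idea concrete, here is a rough sketch of just the calculation in plain Python. It is not an ExecuteScript script: the function name `price_stats` and the trimmed sample document are made up for illustration, and inside NiFi you would still need to read the flow file content and write the result back through the session. Parsing prices as `Decimal` is my own choice to keep the totals exact.

```python
import json
from decimal import Decimal

# Trimmed copy of the question's input (only the fields the calculation needs).
SAMPLE = """
{ "store": { "books": [
    { "title": "Sayings of the Century", "price": 8.95 },
    { "title": "Sword of Honour", "price": 12.99 },
    { "title": "Moby Dick", "price": 8.99 },
    { "title": "The Lord of the Rings", "price": 22.99 }
] } }
"""

def price_stats(json_text):
    """Return (sum, average) of store.books[*].price; average is None for an empty array."""
    # parse_float=Decimal keeps prices exact instead of using binary floats
    doc = json.loads(json_text, parse_float=Decimal)
    prices = [Decimal(book["price"]) for book in doc["store"]["books"]]
    total = sum(prices, Decimal("0"))
    average = total / len(prices) if prices else None
    return total, average

total, average = price_stats(SAMPLE)
print(total, average)  # prints: 53.92 13.48
```

The same shape (parse, collect the prices, sum, divide by the count) should carry over to whichever scripting language you pick.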

@Matt Burgess has some good examples of using ExecuteScript with JSON arrays here.

Master Guru

I'm working on a processor to do this kind of thing: https://issues.apache.org/jira/browse/NIFI-2735

If you prefer not to follow this path, you could connect together an EvaluateJsonPath processor and an UpdateAttribute processor and use the Expression Language mathematical operators to calculate these values as well.

Contributor

Hello @Andy LoPresto,

I am trying to compute the sum and average using EvaluateJsonPath, UpdateAttribute, and the Expression Language mathematical operators. I can do simple mathematical operations on attributes from a single flowfile, but I am not sure how to access the previous flowfiles, which are necessary to compute any kind of running total or average. Any suggestions on how this can be done without writing a custom processor?

Thanks

Sorry, I misinterpreted the question here. I need to compute an average across multiple flowfiles, which means keeping a running total of a particular attribute across flowfiles.

Master Guru

If a single flow file contains an array and you want to manipulate values within, then @Andy LoPresto's solution is recommended. From your comment on his answer it appears you want to compute the average across multiple flow files. From a flow perspective, how would you know when you were "done" calculating the average? Will you have a running average that is calculated from sum-so-far and count-so-far? Or do you want to take X flow files in, calculate the average, then output the X flow files (or perhaps a single one) with the average for those X flow files?

NiFi 1.2.0 (having implemented NIFI-1582) will include the ability to store and calculate state using UpdateAttribute. This can be used to maintain "sum" and "count" attributes, which at any given point would let you calculate the running average. In the meantime (or alternatively), you could use ExecuteScript or InvokeScriptedProcessor to perform this same function. It would be similar to Andy's approach, but would also store the sum-so-far and count-so-far into the processor's State Map. If you are calculating a running average and want to output each flow file as it comes in (adding a "current average" attribute for example), you can use ExecuteScript. If you want to keep the incoming flow files until a total average can be calculated, then you'd need InvokeScriptedProcessor.
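As a rough illustration of the sum-so-far / count-so-far bookkeeping, here is a minimal Python sketch. A plain dict stands in for the processor's State Map, and the function name `update_running_average` and the `sum` / `count` keys are hypothetical; this is not the NiFi state API. Values are kept as strings on the assumption that the state store holds string key/value pairs.

```python
def update_running_average(state, value):
    """Fold one flow file's value into the stored sum and count; return the current average."""
    # Missing keys mean this is the first flow file seen.
    total = float(state.get("sum", "0")) + float(value)
    count = int(state.get("count", "0")) + 1
    # Write the updated totals back as strings (assumed storage format).
    state["sum"] = str(total)
    state["count"] = str(count)
    return total / count

state = {}  # hypothetical stand-in for the processor's State Map
for attribute_value in ["10", "20", "30"]:
    current_average = update_running_average(state, attribute_value)
    print(attribute_value, current_average)
```

Each incoming flow file could then carry the returned value as a "current average" attribute, which corresponds to the output-as-you-go case described above.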