Created 05-16-2023 07:42 AM
I am storing several JSON Schemas as Parameter Contexts, and I need to access the schemas dynamically to validate flowfiles. Which schema applies depends on the flowfile, so I want to access a particular schema in the Parameter Contexts dynamically using an attribute value. How do I do that?
Here is what I've tried:
#{${schema.name}}
${schema.name:evaluateELString()}
The second technique creates the exact string that I need, #{Blue_JSON_Schema}, and then I try to perform an eval on it. Unfortunately, it doesn't work either.
Hopefully someone knows how to do this. Thanks for taking the time to read this.
Created 05-18-2023 02:37 AM
I am having the same problem with version 1.20.0. I am using the advanced properties of "UpdateAttribute" to set the following attributes:
s3_bucket
#{s3_my_bucket}
s3_buc: ${s3_bucket:evaluateELString()}
It is not working; the value that I am getting is:
s3_buc
Empty string set
This functionality used to work in previous versions.
Thanks
Created 05-30-2023 02:03 PM
@ChuckE
I would not expect this to work. The "evaluateELString()" NiFi Expression Language (NEL) function triggers the evaluation of any NEL statements found in the subject passed to that function. An NEL statement always starts with "${"; however, what you have is "#{" since you have a parameter context reference in your subject. So the expected output would be the literal subject in your case.
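To illustrate the point above, here is a toy Python model of that behavior. It is only a sketch and not NiFi's actual implementation: the function name `evaluate_el_string`, the regular expression, and the sample attributes are all assumptions made for illustration. It shows that a subject containing "${...}" gets evaluated, while a subject containing "#{...}" is not recognized and comes back unchanged.

```python
import re

# Toy model only: this is NOT NiFi's implementation. It mimics the one rule
# described above: an Expression Language statement starts with "${", so a
# "#{...}" parameter reference is not recognized and is returned unchanged.
EL_REFERENCE = re.compile(r"\$\{([A-Za-z0-9_.]+)\}")


def evaluate_el_string(subject: str, attributes: dict) -> str:
    """Replace each ${name} with the matching attribute value (empty if missing)."""
    return EL_REFERENCE.sub(lambda m: attributes.get(m.group(1), ""), subject)


# Hypothetical FlowFile attributes for illustration.
attributes = {"schema.name": "Blue_JSON_Schema", "color": "blue"}

el_result = evaluate_el_string("color is ${color}", attributes)
param_result = evaluate_el_string("#{Blue_JSON_Schema}", attributes)

print(el_result)     # the ${color} reference is substituted
print(param_result)  # the #{...} reference is left as the literal subject
```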
Somewhere in your dataflow you are assigning the literal parameter name to a FlowFile attribute. Why not evaluate the parameter rather than assigning it as a literal string in a FlowFile attribute? Perhaps some more context/details around your use case would help here?
Thanks,
Matt
Created 05-30-2023 09:46 PM
The concept is EXACTLY the same concept as using the AvroSchemaRegistry, which holds a bunch of schemas, and as each flowfile gets read by a single processor that uses a NiFi Recordset Reader/Writer service (e.g. ConvertRecord), they will individually reference their own schema. Some are Red data, some are Blue data, and some are Purple data, and they each dynamically reference their corresponding Avro schema. In fact, you can specify the "schema.name" attribute, which references said schema in the AvroSchemaRegistry.
The premise of my inquiry is that I want to achieve something similar using JSON Schemas despite the fact there is no JSON Schema Registry. So I want to use Parameter Context parameters to hold the schemas, then dynamically refer to them in a similar manner as we can do with Avro schemas.
Created 05-31-2023 09:47 AM
@ChuckE
Understood. What I am curious about is where in your dataflow you implement the logic to determine which schema needs to be used.
So you have some FlowFile with json content in your dataflow.
- You then need to determine which schema needs to go with this specific FlowFile's content.
- Then you want to return that schema text from a parameter context.
So how do you make that determination of what schema goes with which FlowFile? Simply by where data was consumed from?
Have you considered using the "Advanced" UI of the UpdateAttribute processor to create rules, based on how you make your determinations, that add a new FlowFile attribute containing the schema extracted from the parameter context(s)?
Thanks,
Matt
Created 05-31-2023 05:52 PM
Exactly, the advanced feature in UpdateAttribute is how we tag all of the incoming data. A flowfile comes in and the schema.name attribute gets set. Then later down the line we do some validation on the data as it gets read into a NiFi Recordset object in conjunction with the AvroSchemaRegistry.
However, we would like to switch to using JSON Schema, but the ValidateJson processor doesn't support attribute references, so the "JSON Schema" property needs to be either the whole JSON Schema itself or a hard-coded Parameter Context parameter reference.
What this means is that we need to use a separate ValidateJson processor for each JSON Schema (i.e. 20 schemas = 20 processors). Ugh!
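For comparison, the behavior being asked for (one validator that picks the schema per flowfile) can be sketched in Python as below. This is only an illustration of the lookup pattern, not something ValidateJson can do: the `SCHEMAS` dictionary is a hypothetical stand-in for the parameter context, the `Red_JSON_Schema` entry and the function name `missing_required` are made up, and the check covers only the top-level "required" keyword rather than full JSON Schema validation.

```python
import json

# Hypothetical stand-in for the parameter context: schema name -> JSON Schema text.
SCHEMAS = {
    "Blue_JSON_Schema": '{"type": "object", "required": ["id", "shade"]}',
    "Red_JSON_Schema": '{"type": "object", "required": ["id"]}',
}


def missing_required(content: str, attributes: dict) -> list:
    """Look up the schema named by schema.name and list absent required keys.

    Deliberately minimal: only the top-level "required" keyword is checked.
    A real flow would hand the selected schema to a full JSON Schema validator.
    """
    schema = json.loads(SCHEMAS[attributes["schema.name"]])
    document = json.loads(content)
    return [key for key in schema.get("required", []) if key not in document]


# The same content is checked against two schemas, chosen by attribute value.
blue_missing = missing_required('{"id": 1}', {"schema.name": "Blue_JSON_Schema"})
red_missing = missing_required('{"id": 1}', {"schema.name": "Red_JSON_Schema"})

print(blue_missing)
print(red_missing)
```

The point of the sketch is that the schema choice is driven by the schema.name attribute at runtime, so adding a twenty-first schema means adding one entry rather than one more processor.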
This is a very significant shortcoming of the ValidateJson processor and makes it unusable except for simple flows with homogeneous data. Hopefully there are plans to expand its features.
I attached a couple of screenshots to help illustrate the problem.
Created 06-01-2023 07:42 AM
@ChuckE
Thank you for the details as they are very helpful.
The ValidateJson processor was a new community contribution to Apache NiFi in version 1.19.
https://issues.apache.org/jira/browse/NIFI-7392
It does not appear that the processor supports dynamic JSON Schema values at runtime. It requires exactly one resource.
I don't know if this was because of the "Schema Version" property aspect, where supporting dynamic Json could cause issues since each Json Schema may use different schema versions.
I'd encourage you to create an Apache NiFi jira enhancement request for this component.
Thank you,
Matt
Created 06-01-2023 10:33 AM