InferAvroSchema + GetAvroMetadata

Hello,

I'm using the InferAvroSchema processor to infer a table schema from a CSV file. I want to use the GetAvroMetadata processor to extract the record name from the output of InferAvroSchema, but I don't know how to configure it.

I know that I can use a Groovy script to extract the record name, but I think GetAvroMetadata is meant for this purpose too.

[Screenshot attachment: capture.png]

Can someone help me, please?

Thank you

1 REPLY

InferAvroSchema requires you to enter the record name yourself. If you need access to that record name later, you could set a variable on the process group and reference it with Expression Language for the Record Name. You'd then have access to that same variable everywhere in the process group and wouldn't have to extract it.

If you don't have access to the value injected into the schema by InferAvroSchema, and the schema is in an attribute (let's say "inferred.avro.schema"), then you can use the jsonPath() function in NiFi Expression Language to extract the record name into a separate attribute. You'd need an UpdateAttribute processor that sets "record.name" to the following:

${inferred.avro.schema:jsonPath("$.name")}

If your schema is in the content of the flow file, then since it is a JSON object you can use EvaluateJsonPath to get the record name into an attribute, using the following JSONPath expression:

$.name
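
To see why $.name is enough in both cases, here is a minimal Python sketch of the same lookup done outside NiFi. An Avro record schema is a JSON object whose top-level "name" field holds the record name, so the path selects that one value. The schema text and the record_name helper are made up for illustration; they are not what InferAvroSchema would produce for your CSV.

```python
import json

# Hypothetical schema, shaped like one inferred from a two-column CSV.
inferred_schema = """
{
  "type": "record",
  "name": "my_record",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "label", "type": "string"}
  ]
}
"""


def record_name(schema_text):
    """Return the top-level "name" of an Avro record schema, as $.name would."""
    return json.loads(schema_text)["name"]


print(record_name(inferred_schema))  # my_record
```

Note that only the top-level "name" is returned; the "name" entries nested under "fields" are column names and are not matched by $.name.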

ExtractAvroMetadata is for Avro files, so if your flow file contained Avro (with an embedded schema), you could use ExtractAvroMetadata (adding "avro.schema" to the list of metadata to extract) in order to get the schema. But this processor doesn't work for CSV files.
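
To illustrate what "Avro with an embedded schema" means, here is a rough Python sketch that reads the header of an Avro container file by hand, following the layout in the Avro specification (a 4-byte magic, then a metadata map that includes "avro.schema"). It is not how ExtractAvroMetadata is implemented, and the function names are my own. It only shows why the schema can be pulled out of Avro content but not out of CSV content.

```python
import io

AVRO_MAGIC = b"Obj\x01"  # first four bytes of every Avro container file


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag encoded)."""
    shift = 0
    result = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of data")
        value = byte[0]
        result |= (value & 0x7F) << shift
        if not (value & 0x80):  # high bit clear: last byte of this number
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Decode a length-prefixed byte string (Avro 'bytes' and 'string')."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of data")
    return data


def read_avro_metadata(data):
    """Return the header metadata map of an Avro container file."""
    stream = io.BytesIO(data)
    if stream.read(4) != AVRO_MAGIC:
        raise ValueError("not an Avro container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero-length block ends the map
            break
        if count < 0:  # negative count is followed by the block size in bytes
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    return metadata
```

Given the bytes of an Avro file, read_avro_metadata(data)["avro.schema"] holds the schema JSON, and the record name is again its top-level "name". Given CSV bytes, the magic check fails and a ValueError is raised, which is the same reason the processor has nothing to extract from a CSV flow file.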
