NiFi ConvertRecord: AvroRecordSetWriter Producing Invalid Avro


Super Guru

java -jar avro-tools-1.8.2.jar getschema test.avro
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.io.IOException: Not a data file.
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:105)
    at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
    at org.apache.avro.tool.DataFileGetSchemaTool.run(DataFileGetSchemaTool.java:47)
    at org.apache.avro.tool.Main.run(Main.java:87)
    at org.apache.avro.tool.Main.main(Main.java:76)

There are no errors until I try to convert the file to ORC, or download it and inspect it with avro-tools.

1 Jungnickel Rd Ganado���&2017-08-04 15:26:54MovementBFAK@WE�=@�"ƖX�&2017-08-04 15:06:561243142403�TX&2017-08-04 14:15:31 77962%

1 ACCEPTED SOLUTION


Re: NiFi ConvertRecord: AvroRecordSetWriter Producing Invalid Avro

Super Guru

Can you share the configuration of AvroRecordSetWriter? That file doesn't look like it has a schema embedded in it (you can usually see the schema as JSON near the beginning of the file contents). You may need to configure the writer to embed the schema for use by ConvertAvroToORC or avro-tools (if you don't separately provide the schema to the latter).
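If avro-tools is not at hand, the same check can be done by looking at the first bytes of the file. An Avro object container file starts with the four magic bytes 'O', 'b', 'j', 0x01, and its header metadata carries the schema under the key avro.schema. The "Not a data file" error above is what you get when that magic is missing. Here is a minimal Python sketch of the check; the function names and the 4096-byte window are my own choices for illustration, not part of any Avro or NiFi API.

```python
import sys

# Magic bytes at the start of every Avro object container file.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro_container(data: bytes) -> bool:
    """Return True if the data starts with the Avro container magic."""
    return data[:4] == AVRO_MAGIC


def has_embedded_schema(data: bytes, head_size: int = 4096) -> bool:
    """Rough check: container magic plus the 'avro.schema' metadata key
    somewhere in the first head_size bytes (head_size is an arbitrary guess)."""
    return looks_like_avro_container(data) and b"avro.schema" in data[:head_size]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_avro.py test.avro   (script name is hypothetical)
    with open(sys.argv[1], "rb") as f:
        head = f.read(4096)
    print("container file:", looks_like_avro_container(head))
    print("embedded schema:", has_embedded_schema(head))
```

If the first check fails, the file is not a container file at all, and readers that expect one will reject it regardless of how valid the record bytes are.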

Re: NiFi ConvertRecord: AvroRecordSetWriter Producing Invalid Avro

Super Guru

The files are really small too as seen above and don't seem complete.

I have done that conversion before with no issues.

I am wondering if this is related to having a bunch of null fields.

27421-convertrecord.png

27422-jsontreereader.png

27423-avrorecordsetwriter.png

27424-schema1.png

Re: NiFi ConvertRecord: AvroRecordSetWriter Producing Invalid Avro

Super Guru

I think the issue is with the HWX Content-Encoded Schema Reference. This is a special "header" in an Avro file that makes it easy to integrate with HWX Schema Registry serializers and deserializers, but it likely prevents the file from being understood by Apache Avro readers such as the ones in ConvertAvroToORC or avro-tools. If you can, try setting the Schema Write Strategy to Embed Avro Schema; this will result in larger flow files but should work in downstream processors. If/when there is an OrcRecordSetWriter, you should be able to reuse the HWX schema reference option there.
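To see what such a header looks like, here is a small Python sketch that decodes it. As far as I recall from the NiFi record writer documentation, the content-encoded reference is one byte for the protocol version, then 8 bytes for the schema identifier, then 4 bytes for the schema version. Treat that layout and the big-endian byte order as assumptions to verify against your NiFi version; the class and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class SchemaReference(NamedTuple):
    protocol_version: int
    schema_id: int
    schema_version: int


# Assumed layout: 1-byte protocol version, 8-byte schema id,
# 4-byte schema version, big-endian with no padding (13 bytes total).
HEADER = struct.Struct(">Bqi")


def parse_schema_reference(data: bytes) -> SchemaReference:
    """Decode the assumed 13-byte schema reference at the start of data."""
    if len(data) < HEADER.size:
        raise ValueError(f"need at least {HEADER.size} bytes, got {len(data)}")
    return SchemaReference(*HEADER.unpack_from(data))
```

A file that begins with a reference like this instead of the Avro container magic carries only a pointer to the schema, so a reader that expects a container file has nothing to decode the records with.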

Re: NiFi ConvertRecord: AvroRecordSetWriter Producing Invalid Avro

Super Guru

+1 for an OrcRecordSetWriter
