Created 08-29-2017 12:12 AM
I have a text file I'm reading into a NiFi flow. It consists of key-value pairs that look like the following:
status:"400" body_bytes_sent:"174" referer:"google.com" user_agent:"safari" host:"8.8.4.4" query_string:"devices"
status:"400" body_bytes_sent:"172" referer:"yahoo.com" user_agent:"Chrome" host:"8.8.4.3" query_string:"books"
Currently the TailFile processor is successfully reading these files as they are created and appended to. However, I want to output them as Avro files to Kafka. Any idea which processor(s) I need to convert these text files into Avro format in my flow? What would the configuration look like for these processors?
Created 08-29-2017 10:16 PM
Hi @Ed Prout,
If your text file consists of key-value pairs, I used a pipe (|) as the delimiter in the ReplaceText processor's replacement value; you can use any delimiter you like, but it needs to match the delimiter in the InferAvroSchema CSV header definition.
I took a few key-value pairs as input.
Input:
status:"400" body_bytes_sent:"174" referer:"google.com"
Output after the ConvertCSVToAvro processor:
Obj...avro.schema..{ "type": "record", "name": "sample", "doc": "Schema generated by Kite", "fields": [{ "name": "status", "type": "long", "doc": "Type inferred from '400'" }, { "name": "body_bytes_sent", "type": "long", "doc": "Type inferred from '174'" }, { "name": "reference", "type": "string", "doc": "Type inferred from 'google.com'" }] }.avro.codec.snappy...y...Zf....N*...*.8.....google.com*.e...y...Zf....N*
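To make the ReplaceText step concrete, here is a minimal Python sketch of the transformation it performs: pulling the values out of each key:"value" pair and joining them with the pipe delimiter. This is only an illustration outside NiFi; the regular expression and the function name to_delimited are assumptions, not the exact ReplaceText configuration used above.

```python
import re

# Matches one key:"value" pair, e.g. status:"400" (assumed pattern, not the
# exact regex from the ReplaceText configuration)
PAIR = re.compile(r'(\w+):"([^"]*)"')

def to_delimited(line, delimiter="|"):
    """Return (header, row) for one line of key:"value" pairs."""
    pairs = PAIR.findall(line)
    # The keys form the CSV header; the values form the data row.
    header = delimiter.join(key for key, _ in pairs)
    row = delimiter.join(value for _, value in pairs)
    return header, row

sample = 'status:"400" body_bytes_sent:"174" referer:"google.com"'
header, row = to_delimited(sample)
print(header)  # status|body_bytes_sent|referer
print(row)     # 400|174|google.com
```

The header string corresponds to what the InferAvroSchema CSV header definition would need to list, with the same delimiter. Note that this sketch does not handle values that themselves contain the delimiter.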
Created 09-06-2017 07:59 PM
This worked nicely. Thanks Yash!