Created 08-29-2017 12:12 AM
I have a text file that I'm reading into a NiFi flow. It consists of key-value pairs that look like the following:
status:"400" body_bytes_sent:"174" referer:"google.com" user_agent:"safari" host:"8.8.4.4" query_string:"devices"
status:"400" body_bytes_sent:"172" referer:"yahoo.com" user_agent:"Chrome" host:"8.8.4.3" query_string:"books"
Currently the TailFile processor is successfully reading these files as they are created and appended to. However, I want to output them as Avro files to Kafka. Any idea which processor(s) I need in my flow to convert these text files into Avro format? What would the configuration of these processors look like?
Created 08-29-2017 10:16 PM
Hi @Ed Prout,
If your text file consists of key-value pairs, you can convert it with the ReplaceText, InferAvroSchema, and ConvertCSVToAvro processors. I used a pipe (|) as the delimiter in the Replacement Value of the ReplaceText processor; you can use any delimiter you like, but it must match the delimiter used in the InferAvroSchema CSV header definition.
I took a few key-value pairs as input.
Input:
status:"400" body_bytes_sent:"174" referer:"google.com"
Output after the ConvertCSVToAvro processor:
Obj...avro.schema..{
"type": "record",
"name": "sample",
"doc": "Schema generated by Kite",
"fields": [{
"name": "status",
"type": "long",
"doc": "Type inferred from '400'"
},
{
"name": "body_bytes_sent",
"type": "long",
"doc": "Type inferred from '174'"
},
{
"name": "reference",
"type": "string",
"doc": "Type inferred from 'google.com'"
}]
}.avro.codec.snappy...y...Zf....N*...*.8.....google.com*.e...y...Zf....N*
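The exact search pattern used in ReplaceText is not shown in this thread, so the following is only a minimal Python sketch of the transformation that step is meant to perform: strip the keys, keep the quoted values, and join them with the chosen delimiter. The regular expression and the function name `to_delimited` are assumptions for illustration, and the sketch assumes values never contain a double quote. In NiFi itself the equivalent would be a Java regular expression configured on ReplaceText.

```python
import re

# Matches one key:"value" pair; assumes values never contain a double quote.
PAIR = re.compile(r'(\w+):"([^"]*)"')


def to_delimited(line, delimiter="|"):
    """Return (header, row) for one log line, both joined by the delimiter."""
    pairs = PAIR.findall(line)
    header = delimiter.join(key for key, _ in pairs)
    row = delimiter.join(value for _, value in pairs)
    return header, row


header, row = to_delimited('status:"400" body_bytes_sent:"174" referer:"google.com"')
print(header)  # status|body_bytes_sent|referer
print(row)     # 400|174|google.com
```

The header string corresponds to what would go into the InferAvroSchema CSV header definition, and the row is what ConvertCSVToAvro would receive, which is why both must use the same delimiter.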
Created 09-06-2017 07:59 PM
This worked nicely. Thanks Yash!