Pattern: convert CSV to JSON?


Hi, what's the recommended processor sequence to parse single-line CSV entries into JSON? I'm all set on ingest and egress, but still a little fuzzy on the conversion part.

1 ACCEPTED SOLUTION

Master Guru

If you have a fixed set of columns in the CSV that you know ahead of time, you can use ExtractText + ReplaceText. Here is an example I created a while back (rename the file to .xml): csvtojson.txt

8 REPLIES


How about ConvertCSVToAvro followed by ConvertAvroToJSON?

Alternatively, you can use ExtractText with a regex to capture the CSV fields and build the JSON from them.


Yes, it's a known CSV with a header line. I wonder if there's a trick to use the column names and expression language to avoid manual re-typing.

Master Guru

There might be a cleaner way to do this, but if you had an incoming CSV like:

h1,h2,h3,h4

v1,v2,v3,v4

You could capture that in ExtractText by adding a property named csv with this pattern as its value:

(.+),(.+),(.+),(.+)\n(.+),(.+),(.+),(.+)

Then in ReplaceText, use this as the replacement value:

{ "${csv.1}" : "${csv.5}", "${csv.2}" : "${csv.6}", "${csv.3}" : "${csv.7}", "${csv.4}" : "${csv.8}" }

That would produce:

{ "h1" : "v1", "h2" : "v2", "h3" : "v3", "h4" : "v4" }
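
To sanity-check the pattern outside NiFi, here is a minimal Python sketch of the same idea: the regex captures eight groups and a template stitches them into JSON. The function name is made up, and the csv.1 through csv.8 keys only mimic the attributes the capture groups would fill, so treat this as an illustration rather than NiFi code.

```python
import re

# Same pattern as above: four header fields, a newline, four value fields.
PATTERN = re.compile(r"(.+),(.+),(.+),(.+)\n(.+),(.+),(.+),(.+)")

def csv_pair_to_json(text):
    """Hypothetical helper: mimic ExtractText + ReplaceText for a 4-column CSV."""
    match = PATTERN.search(text)
    if match is None:
        raise ValueError("input does not look like a 4-column header/value pair")
    # Emulate the csv.1 .. csv.8 attributes that the capture groups would fill.
    attrs = {"csv.%d" % i: match.group(i) for i in range(1, 9)}
    # Emulate the replacement template by pairing header N with value N + 4.
    pairs = []
    for i in range(1, 5):
        header = attrs["csv.%d" % i]
        value = attrs["csv.%d" % (i + 4)]
        pairs.append('"%s" : "%s"' % (header, value))
    return "{ " + ", ".join(pairs) + " }"

print(csv_pair_to_json("h1,h2,h3,h4\nv1,v2,v3,v4"))
# prints: { "h1" : "v1", "h2" : "v2", "h3" : "v3", "h4" : "v4" }
```

Note that plain string templating like this does not escape quotes or backslashes inside values, so it only yields valid JSON for simple field contents.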

Guru

I would use ExtractText with a regex and then the AttributesToJSON processor to create a JSON-formatted flow file.
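
A rough Python analogue of this approach, for illustration only: the regex groups land in a dictionary standing in for flow file attributes, and a JSON serializer writes them out. The three-column layout, the attribute names (id, name, city), and the sample line are all made up for the example.

```python
import json
import re

# Hypothetical 3-column layout; named groups stand in for flow file attributes.
LINE_PATTERN = re.compile(r"^(?P<id>[^,]+),(?P<name>[^,]+),(?P<city>[^,]+)$")

def line_to_json(line):
    """Capture CSV fields as attributes, then serialize them as one JSON object."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError("line does not have exactly three comma-separated fields")
    attributes = match.groupdict()
    # json.dumps escapes quotes and backslashes, which a hand-built template would not.
    return json.dumps(attributes)

print(line_to_json('7,Ada "the first",London'))
```

Letting a serializer build the JSON, instead of pasting values into a template, is the main practical difference from the ExtractText + ReplaceText variant: embedded quotes in a field no longer break the output.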

Explorer

I have a convertCSVtoJSON processor. Will see about getting it contributed back.

Explorer

Any word on your contribution? I could use a csv2json processor/sequence right now! 🙂

Master Guru

In Apache NiFi 1.2.0 and 1.3.0 (HDF 3.0.0) there is a ConvertRecord processor that can convert between any combination of Avro, JSON, and CSV.
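
Conceptually, a record-oriented conversion reads each row against the header (or a schema) and writes it back out in the target format, so the column names never need to be retyped. Here is a small Python sketch of that idea using only the standard library. It is an analogy for what a record reader and writer pair does, not NiFi configuration, and the function name is made up.

```python
import csv
import io
import json

def csv_to_json_records(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each row becomes a dict keyed by the header names.
    records = [dict(row) for row in reader]
    return json.dumps(records)

print(csv_to_json_records("h1,h2,h3,h4\nv1,v2,v3,v4\n"))
# prints: [{"h1": "v1", "h2": "v2", "h3": "v3", "h4": "v4"}]
```

Because the keys come from the header line, the same function handles any number of columns and any number of rows.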