Created on 10-28-2016 08:45 PM - edited 08-17-2019 08:39 AM
Apache NiFi 1.0.0 has a lot of cool features, but no JSON-to-CSV converter yet.
For my use case, I wanted to split one JSON file that had 5,000 records into 5,000 CSVs. You could also throw a MergeContent processor in there and make one file. My data came from the excellent test data source, RandomUser. They provide a free API to pull down data: https://api.randomuser.me/?results=5000&format=pretty. I chose the pretty format to get JSON formatted across multiple lines, which is easier to parse.
This is an example of JSON returned from that service:
{"results":[ {"gender":"male", "name":{"title":"monsieur","first":"lohan","last":"marchand"}, "location":{"street":"6684 rue jean-baldassini","city":"auboranges","state":"schwyz","postcode":9591}, "email":"lohan.marchand@example.com", "login":{"username":"biggoose202","password":"esther","salt":"QIU1HBsr","md5":"9e60da6d4490cd6d102e8010ac98f283","sha1":"3de3ea419da1afe5c83518f8b46f157895266d17","sha256":"c6750c1a5bd18cac01c63d9e58a57d75520861733666ddb7ea6e767a7460479b"}, "dob":"1965-01-28 03:56:58", "registered":"2014-07-26 11:06:46", "phone":"(849)-890-5523", "cell":"(395)-127-9369", "id":{"name":"AVS","value":"756.OUVK.GFAB.51"}, "picture":{"large":"https://randomuser.me/api/portraits/men/69.jpg","medium":"https://randomuser.me/api/portraits/med/men/69.jpg","thumbnail":"https://randomuser.me/api/portraits/thumb/men/69.jpg"}, "nat":"CH"} ]
Step 1: GetFile: Read the JSON file from a directory (I can process any JSON file in there)
Step 2: SplitJson: Split on $.results so that each element of the results array becomes its own flow file.
Step 3: EvaluateJsonPath: Pull attributes out of each JSON record. Example: $.cell
Step 4: UpdateAttribute: Update the filename to be unique: ${md5}.csv.
Step 5: ReplaceText: Format the extracted attributes into a line of comma-separated values.
Step 6: PutFile: Store the resulting file in a directory. (This could also be PutHDFS or many other sink processors).
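Outside of NiFi, the same split-and-format logic can be sketched in a few lines of Python. This is only an illustration of what the steps above do, not the NiFi configuration itself. The function names and the choice of columns (first name, last name, email, cell, nat) are assumptions made for the example; the flow above does not list which attributes were extracted.

```python
import csv
import io
import json


def record_to_csv_line(record):
    """Format one RandomUser record as a single CSV line (the ReplaceText step)."""
    # Column choice is an assumption for this sketch, not taken from the flow.
    fields = [
        record["name"]["first"],
        record["name"]["last"],
        record["email"],
        record["cell"],
        record["nat"],
    ]
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue()


def split_to_csv_files(json_text):
    """Return {filename: csv_line}, one entry per element of $.results."""
    results = json.loads(json_text)["results"]      # SplitJson on $.results
    files = {}
    for record in results:
        # UpdateAttribute step: name each file after the record's md5 value.
        filename = record["login"]["md5"] + ".csv"
        files[filename] = record_to_csv_line(record)
    return files
```

Writing each entry of the returned dictionary to disk would correspond to the PutFile step; concatenating the values instead would be roughly what MergeContent does.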
Created on 12-13-2017 10:36 AM - edited 08-17-2019 08:38 AM
@Timothy Spann I have a similar use case, but my flow fails at the ReplaceText processor. Can you give some advice on how to configure it?
InvokeHTTP --> EvaluateJsonPath --> ReplaceText --> MergeContent --> UpdateAttribute --> PutHDFS
My flow makes several HTTP calls with InvokeHTTP (each call with a different ID), extracts attributes from each JSON that is returned (each JSON is unique), and then creates the CSVs as in your example. However, after the MergeContent processor, the merged CSV contains a lot of duplicate data even though all incoming JSONs contain unique data.
ReplaceText conf:
MergeContent conf:
Created on 01-04-2018 08:42 PM
Can you post the logs and the error?
The better way now is to get the JSON schema (you can use InferAvroSchema if you don't have one), then just go from SplitJson to ConvertRecord.
No manual coding.
I have lots of new articles on this.
Created on 01-05-2018 06:09 PM
With the record-aware processors like ConvertRecord, you don't need SplitJson either; they can work on a whole JSON array.
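As a rough analogy in plain Python (not NiFi code), converting a whole JSON array in one pass looks like the sketch below. The `columns` argument stands in for the schema a record reader would be given; the function name and that parameter are assumptions made for illustration.

```python
import csv
import io
import json


def json_array_to_csv(json_text, columns):
    """Convert a JSON array of flat objects to CSV text with a header row."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",   # drop keys that are not in the column list
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)    # whole array at once, no per-record split
    return buffer.getvalue()
```

The point of the analogy is that the array is handled as one unit, so nothing has to be split apart and merged back together later.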
Created on 01-10-2018 10:25 AM
Hi @Timothy Spann
I want to convert a JSON-format file to CSV using NiFi.
My data looks like this:
name : surendra,
age : 25
address : {city:chennai,state:TN,zipcode:600234}
Now I want the output in the format below:
Name : surendra
Age : 25
Address_city : chennai, Address_state : TN, Address_zipcode : 600234
Can this be done? Please let me know.
Created on 03-16-2018 03:15 PM
Now use the record processors.
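For reference, the flattening asked about above (address.city becoming address_city, and so on) comes down to the logic sketched here in plain Python. This is not a NiFi configuration; the function name `flatten` and the underscore separator are assumptions for the example, and keys keep their original case.

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten nested dictionaries into a single level with joined key names."""
    flat = {}
    for key, value in record.items():
        # Prefix the key with its parent's name, e.g. "address" + "_" + "city".
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))   # recurse into nested objects
        else:
            flat[name] = value
    return flat
```

Each flattened dictionary then maps directly onto one CSV row, with the joined key names as column headers.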