Super Guru

Apache NiFi 1.0.0 has a lot of cool features, but no JSON-to-CSV converter yet.

For my use case, I wanted to split one JSON file containing 5,000 records into 5,000 CSV files. You could also throw a MergeContent processor in there to produce a single file. My data came from the excellent test data source RandomUser, which provides a free API for pulling down data: https://api.randomuser.me/?results=5000&format=pretty. I chose format=pretty to get JSON formatted across multiple lines, which is easier to parse.

This is an example of JSON returned from that service:

{"results":[
{"gender":"male",
"name":{"title":"monsieur","first":"lohan","last":"marchand"},
"location":{"street":"6684 rue jean-baldassini","city":"auboranges","state":"schwyz","postcode":9591},
"email":"lohan.marchand@example.com",
"login":{"username":"biggoose202","password":"esther","salt":"QIU1HBsr","md5":"9e60da6d4490cd6d102e8010ac98f283","sha1":"3de3ea419da1afe5c83518f8b46f157895266d17","sha256":"c6750c1a5bd18cac01c63d9e58a57d75520861733666ddb7ea6e767a7460479b"},
"dob":"1965-01-28 03:56:58",
"registered":"2014-07-26 11:06:46",
"phone":"(849)-890-5523",
"cell":"(395)-127-9369",
"id":{"name":"AVS","value":"756.OUVK.GFAB.51"},
"picture":{"large":"https://randomuser.me/api/portraits/men/69.jpg","medium":"https://randomuser.me/api/portraits/med/men/69.jpg","thumbnail":"https://randomuser.me/api/portraits/thumb/men/69.jpg"},
"nat":"CH"}
]}

Step 1: GetFile: Read the JSON file from a directory. (It can process any JSON file in there.)

Step 2: SplitJson: Split on $.results so that each element of the results array becomes its own flow file.

8982-splitjson.png
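To make the split in Step 2 concrete, here is a minimal Python sketch of the same idea outside NiFi. It is only an analogue of what SplitJson does with $.results, not NiFi code. The function name split_results is hypothetical, and the second sample record is made up for illustration.

```python
import json


def split_results(document: str) -> list[str]:
    """Return one JSON string per element of the top-level "results" array."""
    parsed = json.loads(document)
    return [json.dumps(record) for record in parsed["results"]]


# The first record uses values from the article; the second is invented.
sample = (
    '{"results":['
    '{"gender":"male","nat":"CH"},'
    '{"gender":"female","nat":"FR"}'
    ']}'
)

for record in split_results(sample):
    print(record)
```

Each string returned here corresponds roughly to one flow file coming out of the split.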

Step 3: EvaluateJsonPath: Pull fields out of each JSON record into flow file attributes. Example: $.cell

8981-pullkeyattributes.png
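The attribute extraction in Step 3 can be sketched in Python as follows. This is a simplified stand-in that only handles plain dotted paths such as $.cell or $.login.md5, not full JSONPath. The helper evaluate_path and the attribute-to-path mapping are assumptions for illustration; the actual mapping used in the flow is in the screenshot.

```python
import json


def evaluate_path(record: dict, path: str):
    """Resolve a simple dotted path such as "$.login.md5" against a parsed record."""
    if not path.startswith("$."):
        raise ValueError("only simple paths starting with '$.' are supported")
    value = record
    for key in path[2:].split("."):
        value = value[key]  # raises KeyError if the field is missing
    return value


record = json.loads(
    '{"cell":"(395)-127-9369",'
    '"name":{"first":"lohan","last":"marchand"},'
    '"login":{"md5":"9e60da6d4490cd6d102e8010ac98f283"}}'
)

# Hypothetical attribute name -> path mapping.
paths = {
    "cell": "$.cell",
    "first": "$.name.first",
    "last": "$.name.last",
    "md5": "$.login.md5",
}
attributes = {name: evaluate_path(record, path) for name, path in paths.items()}
print(attributes)
```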

Step 4: UpdateAttribute: Set the filename to a unique value, ${md5}.csv.

8940-updateattribute.png

Step 5: ReplaceText: Format the extracted attributes into a line of comma-separated values.

8983-replacetextcsv.png
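Steps 4 and 5 together amount to naming the file after the md5 attribute and writing the attributes out as one CSV line. The sketch below shows that in Python, assuming the attributes were already extracted. The column order and the helper to_csv_line are hypothetical; the exact replacement value used in ReplaceText is in the screenshot and is not reproduced here.

```python
import csv
import io


def to_csv_line(attributes: dict, columns: list[str]) -> str:
    """Join the chosen attribute values into one CSV line, quoting where needed."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(
        [attributes[column] for column in columns]
    )
    return buffer.getvalue().rstrip("\n")


# Values taken from the sample record in the article.
attributes = {
    "first": "lohan",
    "last": "marchand",
    "email": "lohan.marchand@example.com",
    "cell": "(395)-127-9369",
    "md5": "9e60da6d4490cd6d102e8010ac98f283",
}
columns = ["first", "last", "email", "cell"]  # assumed column order

line = to_csv_line(attributes, columns)
filename = f"{attributes['md5']}.csv"
print(filename)
print(line)
```

One difference worth noting: the csv module quotes values that contain commas, whereas a plain text template that joins attributes with commas would not, so values with embedded commas need extra care in that approach.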

Step 6: PutFile: Store the resulting file in a directory. (This could also be PutHDFS or one of many other sink processors.)

8939-flow.png

Comments
Rising Star

@Timothy Spann I have a similar use case; however, my flow fails at the ReplaceText processor. Can you give some advice on the configuration of the processor?

InvokeHTTP --> EvaluateJsonPath --> ReplaceText --> MergeContent --> UpdateAttribute --> PutHDFS

My flow makes several HTTP calls with InvokeHTTP (each call with a different ID), extracts attributes from each JSON response (each response is unique), and then creates the CSVs as in your example. However, after the MergeContent processor the merged CSV contains a lot of duplicate data, even though all the incoming JSON documents contain unique data.

ReplaceText conf:

45400-repltext.png

MergeContent conf:

45401-capture.png

Super Guru

Can you post the logs and the error?

The better way now is to get the JSON schema (you can use InferAvroSchema if you don't have one), then just go from SplitJson to ConvertRecord.

No manual coding.

I have lots of new articles on this.

With record-aware processors like ConvertRecord, you don't need SplitJson either; they can work on a whole JSON array.
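For comparison, converting the whole array in one pass (rather than splitting first) looks roughly like this in Python. This is only an analogue of the record-based approach, not how ConvertRecord is implemented or configured. The function name json_array_to_csv and the chosen columns are assumptions, and only flat top-level fields are handled.

```python
import csv
import io
import json


def json_array_to_csv(document: str, columns: list[str]) -> str:
    """Write every record in the "results" array as a row of one CSV document."""
    records = json.loads(document)["results"]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # skip fields that are not listed in columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# One record with values from the article; "phone" is deliberately not selected.
sample = (
    '{"results":[{"gender":"male","email":"lohan.marchand@example.com",'
    '"phone":"(849)-890-5523","nat":"CH"}]}'
)
print(json_array_to_csv(sample, ["gender", "email", "nat"]))
```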

New Contributor

Hi @Timothy Spann

I want to convert a JSON-format file to CSV using NiFi.

My data looks like this:

name : surendra,

age : 25

address : {city:chennai,state:TN,zipcode:600234}

Now I want the output in the format below:

Name : surendra

Age : 25

Address_city : chennai, Address_state : TN, Address_zipcode : 600234

Can this be done? Please let me know.

Super Guru

Now use a record processor.
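The flattening asked about above (address.city becoming Address_city, and so on) can be illustrated with a short Python sketch. This is not NiFi configuration; it only shows the transformation itself. The helper names flatten and capitalize_first are hypothetical, and the underscore separator and capitalised first letter follow the output format requested in the question.

```python
def flatten(record: dict, parent: str = "", sep: str = "_") -> dict:
    """Flatten nested dicts, joining parent and child keys with sep."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def capitalize_first(key: str) -> str:
    """Upper-case only the first character, leaving the rest untouched."""
    return key[:1].upper() + key[1:]


record = {
    "name": "surendra",
    "age": 25,
    "address": {"city": "chennai", "state": "TN", "zipcode": 600234},
}

flat = {capitalize_first(key): value for key, value in flatten(record).items()}
for key, value in flat.items():
    print(f"{key} : {value}")
```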

Version history: revision 2 of 2. Last update: 08-17-2019 08:39 AM