Support Questions

Find answers, ask questions, and share your expertise

Parse file in NiFi?

Rising Star

Hi,

I am getting an error while parsing a CSV file with a JSON-formatted column in NiFi. My file has columns like this:

Name : Surendra

Age : 24

Address : {"city":"Chennai","state":"TN","zipcode":"600345"}

The output should look like this:

Name : Surendra

Age : 24

Address_city : Chennai

Address_state : TN

Address_zipcode : 600345

Can anyone please help me with this?

1 ACCEPTED SOLUTION

Master Guru
@Surendra Shringi

We can do this parsing inside NiFi by using the SplitText, ExtractText, ReplaceText and MergeContent processors.

Example:-

Suppose your CSV file has n rows, for example:

Surendra,24,"{"city":"Chennai","state":"TN","zipcode":"600345"}"
Surendra,25,"{"city":"Chennai","state":"TN","zipcode":"609345"}"

We need to split this file into individual flowfiles, with one record in each flowfile. For the splitting we use the

SplitText:-

processor with the following configuration:

Line Split Count

1

49399-splittext.png

So if the input CSV has 2 lines, the SplitText processor splits it into 2 flowfiles, each containing one line.
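
As a rough illustration of what this step does to the content (not how NiFi implements it), the Python sketch below splits the sample text into chunks of one line each. The function name `split_lines` and the variable names are made up for this example.

```python
def split_lines(content: str, line_split_count: int = 1) -> list[str]:
    """Split text into chunks of at most `line_split_count` lines each."""
    lines = content.splitlines()
    return [
        "\n".join(lines[i:i + line_split_count])
        for i in range(0, len(lines), line_split_count)
    ]


# The two sample rows from the example above.
csv_content = (
    'Surendra,24,"{"city":"Chennai","state":"TN","zipcode":"600345"}"\n'
    'Surendra,25,"{"city":"Chennai","state":"TN","zipcode":"609345"}"\n'
)

# One chunk per input line, analogous to one flowfile per record.
flowfiles = split_lines(csv_content, line_split_count=1)
for index, flowfile in enumerate(flowfiles, start=1):
    print(index, flowfile)
```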

Once each record is in its own flowfile, we use

ExtractText:-

to extract the content of the flowfile, by adding the following new properties to the ExtractText processor.

Address_city

"city":"(.*?)"

Address_state

"state":"(.*?)"

Address_zipcode

"zipcode":"(.*?)"

Age

,(.*?),

Name

^(.*?),

49400-extracttext.png

In this processor we extract the contents of the flowfile and keep them as flowfile attributes; each property added above pairs an attribute name with the regex that matches its value.
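
To check what these patterns capture before putting them into the processor, they can be tried against one sample row. The Python sketch below is only an approximation: NiFi uses Java regular expressions, but these particular patterns behave the same way in Python's `re` module. The names `EXTRACT_PROPERTIES` and `extract_attributes` are made up for this example.

```python
import re

# Property name -> regex, copied from the ExtractText properties above.
EXTRACT_PROPERTIES = {
    "Address_city": r'"city":"(.*?)"',
    "Address_state": r'"state":"(.*?)"',
    "Address_zipcode": r'"zipcode":"(.*?)"',
    "Age": r",(.*?),",
    "Name": r"^(.*?),",
}


def extract_attributes(line: str) -> dict[str, str]:
    """Return the first capture group of each pattern that matches the line."""
    attributes = {}
    for name, pattern in EXTRACT_PROPERTIES.items():
        match = re.search(pattern, line)
        if match:
            attributes[name] = match.group(1)
    return attributes


line = 'Surendra,24,"{"city":"Chennai","state":"TN","zipcode":"600345"}"'
attributes = extract_attributes(line)
print(attributes)
```

Note that the Age pattern takes whatever sits between the first two commas, so it only works while the Name column itself contains no comma.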

To create and test regex click here.

Adjust the Maximum Buffer Size value (default is 1 MB) based on your flowfile size.

ReplaceText:-

In the previous step we extracted all the contents of the flowfile into attributes. In the ReplaceText processor we use those attributes to build a new CSV line with a comma delimiter (you can use any delimiter you want), by changing the properties below and setting the Replacement Value as follows.

Configs:-

Search Value

(?s)(^.*$)

Replacement Value

${Name},${Age},${Address_city},${Address_state},${Address_zipcode}

Maximum Buffer Size

1 MB

Replacement Strategy

Always Replace

Evaluation Mode

Entire text

49398-replacetext.png

So the output of the ReplaceText processor would be

Surendra,24,Chennai,TN,600345
Surendra,25,Chennai,TN,609345
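
The effect of the Replacement Value can be sketched in Python. This is only an illustration: `string.Template` happens to accept the same `${name}` placeholder syntax, but it is not NiFi Expression Language, which supports far more than plain substitution. The attribute values are the ones extracted from the first sample row.

```python
from string import Template

# Same text as the Replacement Value property above.
replacement_value = Template(
    "${Name},${Age},${Address_city},${Address_state},${Address_zipcode}"
)

# Attributes as the ExtractText step would set them for the first row.
attributes = {
    "Name": "Surendra",
    "Age": "24",
    "Address_city": "Chennai",
    "Address_state": "TN",
    "Address_zipcode": "600345",
}

# With "Always Replace" and "Entire text", the whole content is swapped
# for the expanded replacement value.
new_content = replacement_value.substitute(attributes)
print(new_content)
```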

We have now created CSV content without the JSON message, but we end up with 2 CSV files (because the input data has 2 lines). If your input file has 1000 lines, you end up with 1000 output CSV files.

If you don't want 2 output files and would rather merge them into 1 output file, use the

MergeContent:-

processor with the configuration below.

49401-mergecontent.png

Change the highlighted properties to suit your requirements. My configuration shows a Max Bin Age of 1 min, so the processor waits for 1 minute and then merges all the queued flowfiles into 1 file.

Set Delimiter Strategy to Text (the default is Filename), because the content of each flowfile must go on its own line in the merged file. Then set the Demarcator property to a newline by pressing Shift+Enter in the value field, so that each flowfile's content starts on a new line.

Output:-

1 file containing both records:

Surendra,24,Chennai,TN,600345
Surendra,25,Chennai,TN,609345
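
In terms of content, merging with a newline demarcator amounts to joining the individual contents with `\n`. The Python sketch below shows only that effect; it does not model binning, Max Bin Age, or any other MergeContent behaviour, and `merge_contents` is a made-up name for this example.

```python
def merge_contents(flowfile_contents: list[str], demarcator: str = "\n") -> str:
    """Join individual flowfile contents, placing the demarcator between them."""
    return demarcator.join(flowfile_contents)


merged = merge_contents(
    [
        "Surendra,24,Chennai,TN,600345",
        "Surendra,25,Chennai,TN,609345",
    ]
)
print(merged)
```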

I highly suggest you refer to the link below to get familiar with all the properties of the MergeContent processor:

https://community.hortonworks.com/questions/149047/nifi-how-to-handle-with-mergecontent-processor.ht...

I'm attaching the XML to this post; you can save it, import it into NiFi, and change it as needed: parse-file-nifi-159780.xml

If this answer helped to resolve your issue, click the Accept button below to accept it. That helps community users find the solution quickly for this kind of error.


5 REPLIES 5

Rising Star

48405-screenshot-from-2018-01-12-100503.png

Hi @Shu, my input looks like this, and I want to parse the data in the way I described above.

Thanks !

Rising Star

I want to fetch this data from MySQL, so I created a table named input in MySQL.

My flow looks like this: ExecuteSQL ->> SplitAvro ->> ConvertAvroToJson ->> EvaluateJsonPath ->> UpdateAttribute

Rising Star

Thanks @shu for your reply. I am looking for the same output that you sent me. I want the output in a CSV file, like Surendra,24,Chennai,TN,600345, and the output will be stored on the local machine only.

Rising Star

Thanks for your overwhelming response, this will help me a great deal.