Read CSV File with Header and Filter rows and then convert to JSON using Apache NiFi

Explorer

Hi,

I'm new to Apache NiFi and I'm looking for a way to filter CSV data on a specific column.

 

I'm able to convert to JSON without filtering, but the SplitText and RouteOnAttribute processors are not helping me filter the data. Below is my input CSV.

 

Input:

number,name,resourceState,location,manufacturer
111897,lok,INSTALLED,HYD,ABC
115677,redd,RETIRED,BLR,ABC
1108448,eswar,PROP_INITIAL,CLT,ABC
1116740,wqwq,INITIAL,AA,ABC

 

Filtering should be based on the resourceState column, keeping only the INSTALLED and RETIRED rows. So the converted JSON should have only 2 rows, like below.

 

Expected JSON Output:

[{
"number": "111897",
"name": "lok",
"resourceState": "INSTALLED",
"location": "HYD",
"manufacturer": "ABC"
},
{
"number": "115677",
"name": "redd",
"resourceState": "RETIRED",
"location": "BLR",
"manufacturer": "ABC"
}]

 

Please help me with the CSV filtering part.

 

Thanks in advance.

1 ACCEPTED SOLUTION

Contributor

Hello @Lokeswar 

The QueryRecord processor does exactly what you want. But your flow needs to follow a record-oriented approach, with QueryRecord configured with a JSON reader and a JSON writer. You then work directly on the flow file content rather than on flow file attributes.

So, you have your CSV:

- Use a ConvertRecord processor to turn it into a record-based flow file, with CSVReader as the record reader and, since you want JSON, JsonRecordSetWriter as the record writer.

- Add a QueryRecord processor. Your query should look like this: SELECT * FROM FLOWFILE WHERE resourceState='INSTALLED' OR resourceState='RETIRED'

Warning: don't put double quotes around the values, use single quotes. In SQL, double quotes denote identifiers such as column names, not string literals.

After this, QueryRecord exposes an output relationship for the query (named after the property that holds it), which you can connect to your next processor.
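
If you want to sanity-check the query against the sample data outside NiFi, here is a minimal Python sketch. It is not NiFi code and not how QueryRecord works internally: it simply loads the CSV from the question into an in-memory SQLite table named FLOWFILE and runs the same SELECT. The function name filter_csv_to_json and the choice of SQLite are assumptions made for illustration, and since NiFi evaluates the query with its own SQL engine, dialect details may differ.

```python
import csv
import io
import json
import sqlite3

# Sample input copied from the question.
CSV_TEXT = """number,name,resourceState,location,manufacturer
111897,lok,INSTALLED,HYD,ABC
115677,redd,RETIRED,BLR,ABC
1108448,eswar,PROP_INITIAL,CLT,ABC
1116740,wqwq,INITIAL,AA,ABC
"""

# Same query as suggested for QueryRecord (string values in single quotes).
QUERY = (
    "SELECT * FROM FLOWFILE "
    "WHERE resourceState='INSTALLED' OR resourceState='RETIRED'"
)


def filter_csv_to_json(csv_text, query=QUERY):
    """Hypothetical helper: load CSV into SQLite, run the query, return JSON."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets each result row convert to a dict

    # Every column is stored as text, so values come back as JSON strings.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f"CREATE TABLE FLOWFILE ({columns})")

    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO FLOWFILE VALUES ({placeholders})", data)

    result = [dict(row) for row in conn.execute(query)]
    conn.close()
    return json.dumps(result, indent=2)


if __name__ == "__main__":
    print(filter_csv_to_json(CSV_TEXT))
```

With the sample input, only the INSTALLED and RETIRED rows should remain, which is a quick way to confirm the WHERE clause before wiring it into the flow.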


Explorer

Thanks Stephane, the QueryRecord processor works as suggested.