Pick Column Based on Index Number
Labels: Apache NiFi
Created ‎03-01-2022 01:53 AM
I have a flow file that has duplicate columns, and I want to pick a column by its index number.
Is it possible to do this with QueryRecord or any other processor?
Note: the columns will change with every new flow file coming in.
Created on ‎03-01-2022 02:58 AM - edited ‎03-01-2022 03:00 AM
Hi, @sachin_32 ,
I guess this is coming as a CSV file, right?
You can achieve what you want with the following approach:
- Configure your CSV Reader to ignore and skip the header line (if any).
- Configure your CSV Reader to use the following schema:
{
  "type": "record",
  "name": "SensorReading",
  "namespace": "com.cloudera.example",
  "doc": "This is a sample sensor reading",
  "fields": [
    { "name": "c1", "type": "string" },
    { "name": "c2", "type": "string" },
    { "name": "c3", "type": "string" }
  ]
}
Ensure you use a schema with the exact number of columns that your input file has.
- In your QueryRecord you can then refer to the columns as c1, c2, and so on:
select c1, c2, c3
from flowfile
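For wider files, writing the c1, c2, ... fields by hand gets tedious. The following is a minimal Python sketch, not part of the original answer, that generates a schema of the same shape for any number of columns. The function name `positional_schema` and the record name `PositionalCsv` are made up for illustration.

```python
import json


def positional_schema(num_columns, name="PositionalCsv"):
    """Build an Avro record schema with string fields named c1..cN.

    The record name is a hypothetical placeholder; use whatever fits your flow.
    """
    if num_columns < 1:
        raise ValueError("num_columns must be at least 1")
    return {
        "type": "record",
        "name": name,
        "fields": [
            {"name": f"c{i}", "type": "string"}
            for i in range(1, num_columns + 1)
        ],
    }


# Render the schema as JSON text that can be pasted into the reader's schema property.
schema_text = json.dumps(positional_schema(3), indent=2)
print(schema_text)
```

The printed JSON can be pasted into the reader's schema text property; change the argument to match the number of columns you need.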
Cheers,
André
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Created ‎03-01-2022 03:15 AM
Thanks for your suggestion, but in this case I don't have an exact number of columns. It keeps changing with each incoming flow file and depends entirely on the flow file. The scenario is this: I have a few columns that I can pick directly by name, but some column names appear more than once, and for those I need to set up something like indexing. Around 10-15 files have this kind of issue. Can you suggest something for that?
Created ‎03-01-2022 03:32 AM
The number of columns in the schema doesn't actually need to be exact if you're happy to ignore the ones after the last one specified in the schema.
Created on ‎03-01-2022 03:55 AM - edited ‎03-01-2022 03:56 AM
ok
Created ‎03-02-2022 03:21 AM
Here's a different attempt. You can send your CSV flowfile to a ReplaceText processor with the following configuration:
The Search Value is the following regular expression:
(?s)^([^,\n]*),([^,\n]*),([^,\n]*),([^,\n]*),([^,\n]*)(.*$)
And the Replacement Value is:
$1,$2,$3,$4,col_a$6
Each capture group ([^,\n]*) matches the name of one column. To keep the name of a column, replace it with $x, where x is the position of the column.
To give a column another name, e.g. col_a, type the new name in the replacement instead.
The last capture group, (.*$), matches the remainder of the first line and, because of the (?s) flag, every line after it. This way you don't need to match every single column, only the ones up to the position you want to replace.
As an example, for this input:
A,B,C,D,A
1,2,3,4,5
2,3,4,5,6
The above replacement will generate this output:
A,B,C,D,col_a
1,2,3,4,5
2,3,4,5,6
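To try the expression outside NiFi, here is a small Python sketch (an illustration, not part of the original flow). It applies the same pattern to the sample input above. Python writes backreferences as \1 rather than $1, and the helper name `rename_fifth_column` is made up for this example.

```python
import re

# Same search expression as above: five header cells, then everything else.
PATTERN = re.compile(
    r"(?s)^([^,\n]*),([^,\n]*),([^,\n]*),([^,\n]*),([^,\n]*)(.*$)"
)

# Python equivalent of the replacement value $1,$2,$3,$4,col_a$6.
REPLACEMENT = r"\1,\2,\3,\4,col_a\6"


def rename_fifth_column(csv_text):
    """Rename the fifth header cell to col_a and leave the rest untouched."""
    return PATTERN.sub(REPLACEMENT, csv_text, count=1)


sample = "A,B,C,D,A\n1,2,3,4,5\n2,3,4,5,6"
result = rename_fifth_column(sample)
print(result)
```

Only the header changes; the data rows pass through in the sixth capture group.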
HTH,
André
Created on ‎03-02-2022 10:41 AM - edited ‎03-02-2022 10:43 AM
Hello @araujo, thank you so much for your help.
Last question: if I have my attribute like this:
what will the replacement value be? In this case I have around 40 columns, and I want to rename only those that are present in my INDEX attribute. I want my columns to look like this:
1,B,3,D,5---10,f,-- till 40
Is there any way to make this independent of all my column names, so that it just replaces the names according to the elements in my INDEX attribute and keeps all the other columns without changing their names?
Created ‎03-02-2022 04:36 PM
Here's another attempt at this (hopefully the last one 🙂):
I created the attached example that gets a flowfile and an attribute INDEX as you described above.
It then uses an UpdateAttribute to convert the INDEX attribute into a FILTER that we can use in the QueryRecord processor.
The QueryRecord processor uses a fixed schema that has 100 columns. It's OK if your CSV has fewer columns. If the CSV can have more than 100 columns, you need to update the schema to the maximum number of columns you expect to receive in any CSV.
The output is a flowfile with the exact columns that were specified in the INDEX attribute.
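The attached flow is not reproduced here, so the following Python sketch only illustrates the idea of turning INDEX into a column filter. It assumes, hypothetically, that INDEX holds comma-separated 1-based positions such as "1,3,5" and that the schema names its fields c1 to c100. The function names and the generated query are illustrative, not the actual expressions used in the attached flow.

```python
def index_to_filter(index_attr, max_columns=100):
    """Turn an INDEX value like "1,3,5" into a column list like "c1, c3, c5".

    Assumes (hypothetically) 1-based positions and a schema with fields
    named c1..c<max_columns>.
    """
    positions = []
    for token in index_attr.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        position = int(token)
        if not 1 <= position <= max_columns:
            raise ValueError(f"position {position} is outside 1..{max_columns}")
        positions.append(position)
    if not positions:
        raise ValueError("INDEX does not contain any positions")
    return ", ".join(f"c{p}" for p in positions)


def build_query(index_attr):
    """Build the SQL text that a QueryRecord query could use."""
    return f"SELECT {index_to_filter(index_attr)} FROM FLOWFILE"


print(build_query("1,3,5"))
```

Validating the positions against the schema width up front keeps a bad INDEX value from producing a query that refers to a column the schema does not define.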
Hope this helps.
Cheers,
Andre
Created ‎03-02-2022 10:30 PM
Here's the flow template for those who have older NiFi versions.
Created ‎03-02-2022 11:02 PM
Thanks for the Help 🙂
