How to replace a value in a file

Explorer

I have a text file containing a header row and some data rows, like this:

 

test_a|test_b|test_c|test_d|test_e
a|b|3.0|4.0|5.0
a|b|3.0|4.0|5.0

a|b|3.0|4.0|5.0

 

Now I want to remove the decimal point and everything after it from the test_c and test_d values, so that the result looks like this:

 

test_a|test_b|test_c|test_d|test_e
a|b|3|4|5.0
a|b|3|4|5.0

a|b|3|4|5.0

 

How can I do this? Thanks.


3 REPLIES

Expert Contributor

@zhangliang To accomplish that, I would use UpdateRecord.

Since your data is CSV and structured, we can use record manipulation to accomplish this.

First, I would treat all your values as strings and build an Avro schema to use:

 

{
	"type": "record",
	"name": "nifiRecord",
	"namespace": "org.apache.nifi",
	"fields": [
		{"name": "test_a", "type": ["null", "string"]},
		{"name": "test_b", "type": ["null", "string"]},
		{"name": "test_c", "type": ["null", "string"]},
		{"name": "test_d", "type": ["null", "string"]},
		{"name": "test_e", "type": ["null", "string"]}
	]
}

 

Then I would configure UpdateRecord to use a CSV Reader and a CSV Writer.

 

I would configure the CSV Reader like this:

Schema Access Strategy = Use 'Schema Text' Property

Schema Text = the Avro schema shown above

Value Separator = |

 

For the CSV Writer, leave everything at its default except:

 

Value Separator = |

 

 

Finally, the UpdateRecord processor needs two user-defined properties, one for each field we want to update: "test_c" and "test_d".

We can then use RecordPath manipulation, and for this use case the substringBefore function in particular, to keep only what comes before the dot (".").

Here is what you should configure:

[image: UpdateRecord processor configuration]
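
For readers without the screenshot: going by the description above, the two properties were presumably along the lines of /test_c = substringBefore(/test_c, '.') and /test_d = substringBefore(/test_d, '.'), with Replacement Value Strategy set to Record Path Value. That is an assumption, since the image itself is not available. The intended effect can be sketched outside NiFi in plain Python; substring_before and truncate_columns are hypothetical helper names used only for illustration, not NiFi APIs.

```python
import csv
import io


def substring_before(value, delimiter):
    """Return the part of value before the first delimiter.

    If the delimiter does not occur, the value is returned unchanged.
    """
    index = value.find(delimiter)
    return value if index == -1 else value[:index]


def truncate_columns(text, columns=("test_c", "test_d"), separator="|"):
    """Cut everything from the first '.' onward in the given columns."""
    # The first line is read as the header and is not treated as a data row.
    reader = csv.DictReader(io.StringIO(text), delimiter=separator)
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        delimiter=separator,
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        for name in columns:
            row[name] = substring_before(row[name], ".")
        writer.writerow(row)
    return out.getvalue()


sample = "test_a|test_b|test_c|test_d|test_e\na|b|3.0|4.0|5.0\n"
print(truncate_columns(sample), end="")
```

Note that this cuts the string at the dot rather than rounding, so a value such as 3.9 would become 3, which matches what the question asks for.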

 

This will then take an input like this:

test_a|test_b|test_c|test_d|test_e
a|b|3.0|4.0|5.0
a|b|3.0|4.0|5.0
a|b|3.0|4.0|5.0

 

and produce an output like this:

 

test_a|test_b|test_c|test_d|test_e
a|b|3|4|5.0
a|b|3|4|5.0
a|b|3|4|5.0

Explorer

Thank you for the advice. I used your design, but the result now has two header rows:

 

test_a|test_b|test_c|test_d|test_e
test_a|test_b|test_c|test_d|test_e
a|b|3|4|5.0
a|b|3|4|5.0
a|b|3|4|5.0

 

How can I fix this? Thanks.

Expert Contributor

Hello,

 

This is most likely because your CSV Reader has:

 

Treat First Line as Header = false (default)

Change that to true.
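
To see why that setting produces the duplicate header, here is a rough analogy in plain Python (not NiFi code; rewrite and FIELDS are hypothetical names for illustration). It assumes the writer always emits a header line from the schema, so if the reader does not consume the first line as a header, that line passes through as an ordinary record and appears a second time.

```python
import csv
import io

# Field names as they appear in the schema from the earlier reply.
FIELDS = ["test_a", "test_b", "test_c", "test_d", "test_e"]


def rewrite(text, first_line_is_header):
    """Read pipe-delimited text and write it back with a header line."""
    reader = csv.reader(io.StringIO(text), delimiter="|")
    rows = [row for row in reader if row]  # ignore blank lines
    if first_line_is_header:
        rows = rows[1:]  # the header line is consumed, not kept as data
    out = io.StringIO()
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    writer.writerow(FIELDS)  # the writer adds its own header line
    writer.writerows(rows)
    return out.getvalue()


sample = "test_a|test_b|test_c|test_d|test_e\na|b|3|4|5.0\n"
print(rewrite(sample, first_line_is_header=False))  # header appears twice
print(rewrite(sample, first_line_is_header=True))   # header appears once
```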