Created on 11-22-2023 01:05 AM - edited 11-22-2023 01:06 AM
I receive a text/csv file with many lines through an InvokeHTTP processor. I don't want the first 7 lines. How can I remove the first 7 lines and keep the rest in the same text/csv format?
Created 11-22-2023 09:23 AM
Hi @glad1 ,
Can you elaborate on the data that you want to remove? For example, if the data is part of the CSV and has a unique value in one or more columns, you can use the QueryRecord processor with a query that excludes records with this unique value. If the data is outside the CSV, like header information, then depending on what it looks like and whether it's surrounded by special characters, you can use the ReplaceText processor with a regex that isolates those lines and replaces them with an empty string. If you can provide some sample data, it would help in figuring out the best solution for this scenario.
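To make the regex idea concrete, here is a rough Groovy sketch of a pattern that matches exactly the first 7 lines of a text. The sample text is made up, since I haven't seen your data, and using this pattern in ReplaceText assumes the processor evaluates the entire text rather than line by line.

```groovy
// Hypothetical sample: 7 junk lines followed by a CSV header and one row
String text = "j1\nj2\nj3\nj4\nj5\nj6\nj7\nid,name\n1,foo\n"

// \A anchors at the very start of the input; the group matches one full line
// including its line ending (LF or CRLF) and is repeated exactly 7 times
String cleaned = text.replaceFirst(/\A(?:[^\r\n]*\r?\n){7}/, '')

println cleaned
```

If the text has fewer than 7 complete lines, the pattern does not match and the text is left unchanged.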
Thanks
Created 11-22-2023 09:52 PM
I've attached an image above that shows how the data looks. I want to remove the first 7 rows so that the 8th row (the header row) comes first.
Created 11-22-2023 11:33 AM
If you're confident the data returned is consistent and always more than 7 lines...then a quick and dirty approach would be a Groovy script like this.
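import java.nio.charset.StandardCharsets
import org.apache.nifi.processor.io.StreamCallback

FlowFile flowFile = session.get()
if (!flowFile) return
// Rewrite the FlowFile content without its first 7 lines
flowFile = session.write(flowFile, { inputStream, outputStream ->
    // Reads the whole content into memory, so this suits small files best
    List<String> lines = inputStream.readLines(StandardCharsets.UTF_8.name())
    outputStream.write(lines.drop(7).join("\n").getBytes(StandardCharsets.UTF_8))
} as StreamCallback)
session.transfer(flowFile, REL_SUCCESS)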
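If the files can get large, or might sometimes have fewer than 7 lines, a streaming variant may be safer. The following is only a sketch under those assumptions: dropLeadingLines is a hypothetical helper name, not something NiFi provides, and it is written as plain Groovy so it can be tried outside NiFi first.

```groovy
import java.nio.charset.StandardCharsets

// Hypothetical helper: copies input to output, skipping the first `skip` lines.
// Lines are processed one at a time instead of loading everything into memory.
void dropLeadingLines(InputStream input, OutputStream output, int skip = 7) {
    def reader = new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8))
    def writer = new BufferedWriter(new OutputStreamWriter(output, StandardCharsets.UTF_8))
    // Discard up to `skip` lines; stops early if the input is shorter than that
    for (int i = 0; i < skip && reader.readLine() != null; i++) { }
    String line
    boolean first = true
    while ((line = reader.readLine()) != null) {
        // Join the remaining lines with "\n", without a trailing newline
        if (!first) writer.write("\n")
        writer.write(line)
        first = false
    }
    writer.flush()
}

// Quick check with in-memory streams
def out = new ByteArrayOutputStream()
def sample = "a\nb\nc".getBytes(StandardCharsets.UTF_8)
dropLeadingLines(new ByteArrayInputStream(sample), out, 2)
println out.toString("UTF-8")
```

Inside the script above, the body of the session.write closure could then presumably be reduced to a single call, dropLeadingLines(inputStream, outputStream).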
Created 11-22-2023 09:49 PM
Thank you, this worked!