Created 11-07-2017 04:36 PM
Hi All,
Thanks a lot to this awesome community.
We have ListenTCP and ListenUDP processors listening for events. We have set "Max Batch Size" to 20000 to increase throughput.
However, sometimes a new-line character is added in between messages in a flowfile.
Is it because of the "Batching Message Delimiter"? I guess not, because it is happening in between the messages in a flowfile.
Any suggestions?
Thanks
Dheeru
Created 11-07-2017 08:29 PM
I don't think this could happen with ListenTCP, but with ListenUDP it could happen if the data being received already has a new-line at the end. For example, if you received these two messages "This is message 1\n" and "This is message 2\n" and then you used the batching delimiter of "\n" then you'd get "This is message 1\n\nThis is message 2\n".
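To make the effect above concrete, here is a minimal sketch in plain Python. It is not NiFi code; it only assumes that batching places the delimiter between consecutive messages. The function names `batch_messages` and `batch_stripped` are made up for illustration.

```python
def batch_messages(messages, delimiter="\n"):
    """Join received messages into one body, placing the delimiter between them."""
    return delimiter.join(messages)


def batch_stripped(messages, delimiter="\n"):
    """Same as above, but drop each message's trailing new-line before joining."""
    return delimiter.join(m.rstrip("\n") for m in messages)


received = ["This is message 1\n", "This is message 2\n"]

# Each message already ends with "\n", so adding the delimiter doubles it.
print(repr(batch_messages(received)))
# 'This is message 1\n\nThis is message 2\n'

# Without the trailing new-lines there is exactly one "\n" between messages.
print(repr(batch_stripped(received)))
# 'This is message 1\nThis is message 2'
```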
Created 11-07-2017 08:48 PM
@Bryan Bende Thanks a lot for the response, I appreciate it. Yes, you are absolutely correct. I have 4 ListenUDP processors merging into 2 merge processors in series, and then PutHDFS. I am batching the messages for throughput in each of the ListenUDP processors, and the default batching delimiter is \n, so sometimes I get "This is message 1\n\nThis is message 2\n".
What I am looking for is this pattern in HDFS:
This is message 1
This is message 2
However, right now it is writing to HDFS as:
This is message 1
\n
This is message 2
Here the extra line takes up additional space. Is there any way we can avoid it?
Thanks
Dheeru
Created 11-07-2017 08:53 PM
Can you change whatever is sending the data to ListenUDP to not send a new-line at the end of the message?
If not, how about ReplaceText to replace \n\n with \n?
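As a rough illustration of what that replacement does to the content, here is the same idea in plain Python. This is only a sketch of the regex, not the ReplaceText processor itself, and `collapse_blank_lines` is a made-up name. It uses `\n{2,}` rather than a literal `\n\n` so that runs of three or more new-lines also collapse to one; that is a small generalization of the suggestion above.

```python
import re


def collapse_blank_lines(text):
    """Replace any run of two or more new-lines with a single new-line."""
    return re.sub(r"\n{2,}", "\n", text)


batched = "This is message 1\n\nThis is message 2\n"
print(repr(collapse_blank_lines(batched)))
# 'This is message 1\nThis is message 2\n'
```

In a processor the pattern would have to be matched against the whole flow file content rather than one line at a time, since a per-line match never sees two consecutive new-lines.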
Created 11-07-2017 09:21 PM
@Bryan Bende Thanks for the response. Unfortunately I will not be able to change anything on the source side, so it looks like I will have to use the ReplaceText processor.
Thank you, I appreciate it.
Created 11-07-2017 09:23 PM
You're welcome... one more option is to use ExecuteScript to run a simple script that reads a flow file line by line and only writes out the lines with length > 0.
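The core of such a script could look something like the sketch below. It is plain Python operating on generic text streams, so the ExecuteScript wiring (getting the flow file from the session and opening its content) is left out, and `drop_empty_lines` is a made-up name.

```python
import io


def drop_empty_lines(reader, writer):
    """Copy lines from reader to writer, skipping lines that are empty
    once the line terminator is removed."""
    for line in reader:
        if len(line.rstrip("\r\n")) > 0:
            writer.write(line)


# Usage with in-memory streams standing in for the flow file content.
src = io.StringIO("This is message 1\n\nThis is message 2\n")
dst = io.StringIO()
drop_empty_lines(src, dst)
print(repr(dst.getvalue()))
# 'This is message 1\nThis is message 2\n'
```

Because it reads one line at a time, it does not need to hold a whole 20000-message batch in memory.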