ReplaceText Processor not processing all Expression Language

New Contributor

Hi,

I want to use the ReplaceText processor to replace some text. My main goal is to escape JSON; however, it seems that neither escapeJson() nor replace() works. Other expressions (like notNull() and substring()) do seem to work.

The ReplaceText processor has the following settings configured:

  • Search value: (?s)(^.*$)
  • Replacement value: ${'$1':escapeJson()}
  • Replacement strategy: Regex replace
  • Evaluation mode: Entire text

I noticed that when I use the notNull() expression, it replaces the text with True, and when I use substring(0, 5) it gives me the first 5 characters. However, when I use escapeJson() or replace('"', '\\"'), the quotes and slashes are not escaped.

An example text value that I used:

https 2018-03-18T23:55:36.990541Z app/abc-pxx-p001/abc123 12.34.56.78:12345 87.65.43.21:80 0.000 0.092 0.000 200 200 1124 364 "GET https://mysite.nl:443/test HTTP/1.1" "abc/1 CFN/987 Darwin/12.3.4" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 arn:aws:elasticloadbalancing:eu-west-1:012345679:targetgroup/abc-external-abc-de/abc123def456 "Root=1-5a-d2a18df071c" "mysite.nl" "session-reused" 0

Master Guru

@Xander L

Method1:-

Add an ExtractText processor before the ReplaceText processor to store the whole content of the flowfile in an attribute, then reference that attribute in the ReplaceText processor so that the escapeJson function can be applied to it.

Extract text Processor Configs:-

65391-extracttext.png

Add a new property by clicking the + sign in the top right corner.

In my case I added a property named extract with the regex below:

  • extract: (.*)

This extracts all the content of the flowfile and stores it in the flowfile's extract attribute.

Increase the sizes of the properties below if you have large content to extract:

  • Maximum Buffer Size: 1 MB
  • Maximum Capture Group Length: 1024
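
A side note on the (.*) regex above: in plain java.util.regex, the dot does not match line terminators unless DOTALL mode is enabled, so (.*) captures only up to the first line break. The sketch below shows this with the standard Java regex classes. It is not NiFi code; the class name CaptureAllSketch and the sample content are made up for illustration, and whether ExtractText needs a DOTALL setting for multi-line content is an assumption to check against the processor documentation. The single-line log record in the question is not affected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: plain java.util.regex, not NiFi's ExtractText code.
class CaptureAllSketch {
    // Returns capture group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String content) {
        Matcher m = Pattern.compile(regex).matcher(content);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String content = "line one\nline two"; // hypothetical two-line content

        // Without DOTALL, '.' stops at the line break.
        System.out.println(firstGroup("(.*)", content));     // prints: line one

        // With the (?s) flag, '.' also matches line breaks,
        // so both lines are captured.
        System.out.println(firstGroup("(?s)(.*)", content));
    }
}
```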

Output from extract text processor:-

65393-output-extracttext.png

Replace text Configs:-

65392-replacetext.png

Change the configuration to match the screenshot:

  • Search Value: (?s)(^.*$)
  • Replacement Value: ${extract:escapeJson()}
  • Character Set: UTF-8
  • Maximum Buffer Size: 1 MB (increase this if the flowfile is larger than 1 MB)
  • Replacement Strategy: Always Replace
  • Evaluation Mode: Entire text

This applies the escapeJson function to the extract attribute and uses the result as the replacement value.

Output:-

https 2018-03-18T23:55:36.990541Z app\/abc-pxx-p001\/abc123 12.34.56.78:12345 87.65.43.21:80 0.000 0.092 0.000 200 200 1124 364 \"GET https:\/\/mysite.nl:443\/test HTTP\/1.1\" \"abc\/1 CFN\/987 Darwin\/12.3.4\" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 arn:aws:elasticloadbalancing:eu-west-1:012345679:targetgroup\/abc-external-abc-de\/abc123def456 \"Root=1-5a-d2a18df071c\" \"mysite.nl\" \"session-reused\" 0
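
To make the transformation in the output above concrete, here is a minimal hand-written sketch of JSON string escaping in Java. It is an approximation for illustration, not NiFi's escapeJson implementation: the class name JsonEscapeSketch is hypothetical, and only quotes, backslashes, forward slashes and control characters are handled.

```java
// Illustration only: a simplified JSON string escaper, not NiFi's own code.
class JsonEscapeSketch {
    static String escapeJson(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break; // "  becomes \"
                case '\\': out.append("\\\\"); break; // \  becomes \\
                case '/':  out.append("\\/");  break; // /  becomes \/
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters as \\uXXXX escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Fragment of the sample log record from the question.
        String sample = "\"GET https://mysite.nl:443/test HTTP/1.1\"";
        // prints: \"GET https:\/\/mysite.nl:443\/test HTTP\/1.1\"
        System.out.println(escapeJson(sample));
    }
}
```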

(or)

Method2:-

Alternatively, two ReplaceText processors in series give the same result without an ExtractText processor.

Replace / with \/:-

  • Search Value: /
  • Replacement Value: \/
  • Character Set: UTF-8
  • Maximum Buffer Size: 1 MB
  • Replacement Strategy: Literal Replace
  • Evaluation Mode: Entire text

65394-replacetext-1.png

This processor searches for the literal / and replaces it with \/.

Feed the success relationship from this ReplaceText processor into the next ReplaceText processor.

ReplaceText processor 2, to replace " with \":-

65395-replacetext-2.png

  • Search Value: "
  • Replacement Value: \"
  • Character Set: UTF-8
  • Maximum Buffer Size: 1 MB
  • Replacement Strategy: Literal Replace
  • Evaluation Mode: Entire text

The second ReplaceText processor searches for " and replaces it with \", again using Literal Replace as the Replacement Strategy.

The flowfile content output by the second ReplaceText processor is the same as with method 1:

https 2018-03-18T23:55:36.990541Z app\/abc-pxx-p001\/abc123 12.34.56.78:12345 87.65.43.21:80 0.000 0.092 0.000 200 200 1124 364 \"GET https:\/\/mysite.nl:443\/test HTTP\/1.1\" \"abc\/1 CFN\/987 Darwin\/12.3.4\" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 arn:aws:elasticloadbalancing:eu-west-1:012345679:targetgroup\/abc-external-abc-de\/abc123def456 \"Root=1-5a-d2a18df071c\" \"mysite.nl\" \"session-reused\" 0
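
The two literal replacements of method 2 can be sketched in plain Java as two chained String.replace calls, which also treat the search and replacement values literally. This is an illustration of the idea, not NiFi code, and the class and method names are hypothetical. Note one difference from escapeJson: backslashes already present in the content are left as they are.

```java
// Illustration only: mirrors the two Literal Replace steps of method 2.
class LiteralReplaceChainSketch {
    // Step 1: replace every / with \/ (literal, no regex involved).
    static String escapeSlashes(String text) {
        return text.replace("/", "\\/");
    }

    // Step 2: replace every " with \" (literal, no regex involved).
    static String escapeQuotes(String text) {
        return text.replace("\"", "\\\"");
    }

    // Runs both steps in the same order as the two processors in series.
    static String run(String text) {
        return escapeQuotes(escapeSlashes(text));
    }

    public static void main(String[] args) {
        // Fragment of the sample log record from the question.
        String sample = "\"abc/1 CFN/987 Darwin/12.3.4\"";
        // prints: \"abc\/1 CFN\/987 Darwin\/12.3.4\"
        System.out.println(run(sample));
    }
}
```

Neither step searches for backslashes, so the order of the two steps does not change the result.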

Both approaches give the same result; choose whichever fits your case better.

Let us know if you have any issues.

If the answer helped resolve your issue, click the Accept button below to accept it. That helps other community users find solutions to this kind of issue quickly.

New Contributor

@Shu Thanks for your answer. I understand that there are ways to work around this issue. I also did that using the second method. However, isn't this simply a bug? The module states it is able to work with the NiFi expression language, so why is it not working for all expressions?

Shouldn't this be taken care of? That way we don't have to work around this issue.

New Contributor

Using NiFi 1.5, the ReplaceText Replacement Value ${'$1':replace('"', '\\\\"')} works.
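
A possible explanation for why the extra backslashes help, assuming ReplaceText hands the evaluated Replacement Value to java.util.regex as a replacement string (an assumption, not confirmed in this thread): in a Java regex replacement string, a backslash escapes the next character, so \" collapses back to ". The sketch below shows that effect with plain Java; the class name ReplacementBackslashSketch and the sample text are made up for illustration.

```java
import java.util.regex.Matcher;

// Illustration only: plain java.util.regex behavior, not NiFi's ReplaceText code.
class ReplacementBackslashSketch {
    // Same search pattern as in the question: capture the entire text.
    static final String SEARCH = "(?s)(^.*$)";

    // Uses the already-escaped text directly as a regex replacement string.
    static String naive(String input, String escaped) {
        return input.replaceAll(SEARCH, escaped);
    }

    // Quotes the replacement first so backslashes and dollar signs stay literal.
    static String quoted(String input, String escaped) {
        return input.replaceAll(SEARCH, Matcher.quoteReplacement(escaped));
    }

    public static void main(String[] args) {
        String input = "say \"hi\"";        // the text: say "hi"
        String escaped = "say \\\"hi\\\""; // the text: say \"hi\"

        // The replacement engine consumes each backslash.
        System.out.println(naive(input, escaped));  // prints: say "hi"

        // Quoting the replacement keeps the backslashes.
        System.out.println(quoted(input, escaped)); // prints: say \"hi\"

        // Doubling every backslash also survives the unescaping step.
        String doubled = escaped.replace("\\", "\\\\");
        System.out.println(naive(input, doubled));  // prints: say \"hi\"
    }
}
```

If that assumption holds, it would also be consistent with method 1 working: there the Replacement Strategy is Always Replace rather than Regex Replace.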

Thank you! I was searching for the same thing and wondering why that was not working... I simply added the single quotes and now that's ok.