<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Multiple ReplaceText Processors in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Multiple-ReplaceText-Processors/m-p/375684#M242592</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/106702"&gt;@JohnnyRocks&lt;/a&gt;, as &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/95503"&gt;@steven-matison&lt;/a&gt; said, you should avoid chaining so many ReplaceText processors.&lt;BR /&gt;I am not sure I understood your flow exactly, but I suspect that something upstream of ReplaceText is not properly configured in your NiFi flow.&lt;BR /&gt;&lt;BR /&gt;First of all, when using the classic Java date format, "MM" always renders a two-digit month, meaning that months 1 through 9 are automatically prefixed with a leading zero. "dd" does the same for days. As I see in your post, your CSV reader is configured to read the data as MM/dd/yy, which should be fine, but something is missing here ---&amp;gt; How do you end up with the dd/MM/yyyy format?&lt;BR /&gt;&lt;BR /&gt;What I would personally try is to convert all those date values into the same format. So instead of all those ReplaceText processors, I would insert an UpdateRecord processor, where I would define my Record Reader and my Record Writer with the desired schemas (make sure that your column is type int with logical type date). Next, in that processor, I would change the Replacement Value Strategy to "Record Path Value", click + and add a new property. I would call it "&lt;EM&gt;/Launch_Date&lt;/EM&gt;" (pay attention to the leading slash) and I would assign it the value " &lt;EM&gt;format( /Launch_Date, "dd/MM/yyyy", "Europe/Bucharest")&lt;/EM&gt; " (or any other timezone you require -- if you require your data in UTC, just remove the comma and the timezone).&lt;/P&gt;</description>
    <pubDate>Mon, 28 Aug 2023 15:05:26 GMT</pubDate>
    <dc:creator>cotopaul</dc:creator>
    <dc:date>2023-08-28T15:05:26Z</dc:date>
    <item>
      <title>Multiple ReplaceText Processors</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Multiple-ReplaceText-Processors/m-p/375635#M242560</link>
      <description>&lt;P&gt;Cloudera,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am working on a NiFi flow in which incoming dates are treated as follows in my input schema:&lt;/P&gt;&lt;P&gt;{ "name": "Launch_Date", "type": ["null",{ "type" : "int", "logicalType" : "date"}] },&lt;/P&gt;&lt;P&gt;And my CSVReader uses the following date format:&amp;nbsp; MM/dd/yy&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When records are being ingested, I am getting Java errors on the month and day portion of dates that are single digit, such as 6/23/2019 and 11/3/2019, so I used the ReplaceText processor to add leading zeroes (06/23/2019, 11/03/2019).&amp;nbsp; What I was not able to figure out was if/how I can do three date conversions in one ReplaceText processor (I also have to add leading zeroes to dates like 6/3/2019 so they become dates like 06/03/2019).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So, I have three ReplaceText processors that do regex replacements on the following cases:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;dates w/ single month digit&lt;/LI&gt;&lt;LI&gt;dates w/ single day digit&lt;/LI&gt;&lt;LI&gt;dates w/ single month and single day digits&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;In all three cases, zeroes are prepended to the single digits.&amp;nbsp; The portion of my flow with the three ReplaceText processors is shown below.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="JohnnyRocks_1-1693171695797.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/38310i0261694AF80D2A3E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="JohnnyRocks_1-1693171695797.png" alt="JohnnyRocks_1-1693171695797.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Is there a way to do all three cases in one ReplaceText processor?&amp;nbsp; Or should I be handling the dates a different way so that I do not get Java errors on dates with single MM/dd digits?&amp;nbsp; What I have done works fine, it just seems like I should be able to use one processor instead of three.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;JohnnyRocks&lt;/P&gt;</description>
      <pubDate>Sun, 27 Aug 2023 21:31:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Multiple-ReplaceText-Processors/m-p/375635#M242560</guid>
      <dc:creator>JohnnyRocks</dc:creator>
      <dc:date>2023-08-27T21:31:27Z</dc:date>
    </item>
    <item>
      <title>Re: Multiple ReplaceText Processors</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Multiple-ReplaceText-Processors/m-p/375637#M242561</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/106702"&gt;@JohnnyRocks&lt;/a&gt; ,&lt;/P&gt;&lt;P&gt;It's hard for me to suggest a solution without seeing what the input looks like. However, based on the schema you provided and assuming that you are dealing with date-only values, I tested the following config in the ReplaceText processor and it appears to work in all cases:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="SAMSAL_0-1693174312600.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/38311i4FD1427D7A5FDC3F/image-size/medium?v=v2&amp;amp;px=400" role="button" title="SAMSAL_0-1693174312600.png" alt="SAMSAL_0-1693174312600.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Search Value: &lt;STRONG&gt;(\b\d/)&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Replacement Value: &lt;STRONG&gt;0$1&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If that doesn't help, can you provide sample input and the search and replacement values used in each of the three ReplaceText processors?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If that helps, please &lt;STRONG&gt;accept&lt;/STRONG&gt; the solution.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Sun, 27 Aug 2023 22:14:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Multiple-ReplaceText-Processors/m-p/375637#M242561</guid>
      <dc:creator>SAMSAL</dc:creator>
      <dc:date>2023-08-27T22:14:41Z</dc:date>
    </item>
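The single search-and-replace suggested in the reply above can be checked outside NiFi. This is a minimal sketch in Python rather than NiFi's Java regex engine, so the replacement is written as 0\1 instead of 0$1; the function name pad_single_digits is hypothetical and only used for illustration.

```python
import re

# Same pattern as the reply's Search Value: a word boundary, one digit, then a slash.
# A two-digit month or day never matches, because the digit right after the
# word boundary is followed by another digit instead of a slash.
SINGLE_DIGIT_FIELD = re.compile(r"(\b\d/)")


def pad_single_digits(text):
    """Prefix a zero to every single-digit month or day that precedes a slash."""
    # r"0\1" is the Python spelling of the reply's Replacement Value 0$1.
    return SINGLE_DIGIT_FIELD.sub(r"0\1", text)


if __name__ == "__main__":
    for sample in ("6/23/2019", "11/3/2019", "6/3/2019", "11/23/2019"):
        print(sample, "becomes", pad_single_digits(sample))
```

All three cases from the question (single-digit month, single-digit day, both) go through the same substitution, which is why one processor is enough under this pattern. The year is untouched because it is not followed by a slash.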
    <item>
      <title>Re: Multiple ReplaceText Processors</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Multiple-ReplaceText-Processors/m-p/375673#M242584</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/106702"&gt;@JohnnyRocks&lt;/a&gt;&amp;nbsp;Using ReplaceText more than once is something you want to avoid entirely.&amp;nbsp; You need to look at how to solve the schema concerns within the record-based processors.&amp;nbsp; It should be possible to avoid ReplaceText altogether.&amp;nbsp; If your upstream data is &lt;EM&gt;&lt;STRONG&gt;that&lt;/STRONG&gt; &lt;/EM&gt;different (3 different formats) within the same pipeline, consider how to address that upstream or in separate NiFi flows.&amp;nbsp; Alternatively, multiple pipelines can be built with separate top branches that pipe into the same record-based processor.&amp;nbsp; This would be something like 3 single routes, each through a ReplaceText, then all going to ConvertRecord.&amp;nbsp; However, I would still try to optimize without ReplaceText in the manner described here.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Aug 2023 12:31:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Multiple-ReplaceText-Processors/m-p/375673#M242584</guid>
      <dc:creator>steven-matison</dc:creator>
      <dc:date>2023-08-28T12:31:52Z</dc:date>
    </item>
    <item>
      <title>Re: Multiple ReplaceText Processors</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Multiple-ReplaceText-Processors/m-p/375684#M242592</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/106702"&gt;@JohnnyRocks&lt;/a&gt;, as &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/95503"&gt;@steven-matison&lt;/a&gt; said, you should avoid chaining so many ReplaceText processors.&lt;BR /&gt;I am not sure I understood your flow exactly, but I suspect that something upstream of ReplaceText is not properly configured in your NiFi flow.&lt;BR /&gt;&lt;BR /&gt;First of all, when using the classic Java date format, "MM" always renders a two-digit month, meaning that months 1 through 9 are automatically prefixed with a leading zero. "dd" does the same for days. As I see in your post, your CSV reader is configured to read the data as MM/dd/yy, which should be fine, but something is missing here ---&amp;gt; How do you end up with the dd/MM/yyyy format?&lt;BR /&gt;&lt;BR /&gt;What I would personally try is to convert all those date values into the same format. So instead of all those ReplaceText processors, I would insert an UpdateRecord processor, where I would define my Record Reader and my Record Writer with the desired schemas (make sure that your column is type int with logical type date). Next, in that processor, I would change the Replacement Value Strategy to "Record Path Value", click + and add a new property. I would call it "&lt;EM&gt;/Launch_Date&lt;/EM&gt;" (pay attention to the leading slash) and I would assign it the value " &lt;EM&gt;format( /Launch_Date, "dd/MM/yyyy", "Europe/Bucharest")&lt;/EM&gt; " (or any other timezone you require -- if you require your data in UTC, just remove the comma and the timezone).&lt;/P&gt;</description>
      <pubDate>Mon, 28 Aug 2023 15:05:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Multiple-ReplaceText-Processors/m-p/375684#M242592</guid>
      <dc:creator>cotopaul</dc:creator>
      <dc:date>2023-08-28T15:05:26Z</dc:date>
    </item>
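The idea in the reply above, parsing each value as a date and then writing it back out in one target pattern instead of patching the text with regexes, can be illustrated outside NiFi. This is only an analogy in Python, not NiFi's RecordPath format function: the helper name normalize_date and the Python patterns (which stand in for MM/dd/yyyy on input and dd/MM/yyyy on output) are assumptions made for this sketch, and time zones are left out.

```python
from datetime import datetime

# Assumed input layout: month/day/four-digit year, with or without leading zeros.
INPUT_PATTERN = "%m/%d/%Y"
# Target layout named in the reply: day/month/year, always zero-padded.
OUTPUT_PATTERN = "%d/%m/%Y"


def normalize_date(raw):
    """Parse a month/day/year string and render it as dd/MM/yyyy."""
    parsed = datetime.strptime(raw.strip(), INPUT_PATTERN)
    return parsed.strftime(OUTPUT_PATTERN)


if __name__ == "__main__":
    for sample in ("6/23/2019", "11/3/2019", "6/3/2019"):
        print(sample, "becomes", normalize_date(sample))
```

Because the value is handled as a date rather than as text, zero-padding falls out of the output pattern and no per-case replacement is needed.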
  </channel>
</rss>

