<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Need help on splitting a text on NiFi based on a specific content sequence or word in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Need-help-on-splitting-a-text-on-NiFi-based-on-a-specific/m-p/181900#M58668</link>
    <description>&lt;P&gt;The SplitContent processor may be what you are looking for: &lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitContent/index.html" target="_blank"&gt;https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitContent/index.html&lt;/A&gt;. It lets you define a byte sequence to split by.&lt;/P&gt;</description>
    <pubDate>Fri, 31 Mar 2017 13:53:59 GMT</pubDate>
    <dc:creator>Former Member</dc:creator>
    <dc:date>2017-03-31T13:53:59Z</dc:date>
    <item>
      <title>Need help on splitting a text on NiFi based on a specific content sequence or word</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Need-help-on-splitting-a-text-on-NiFi-based-on-a-specific/m-p/181899#M58667</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I currently have a flow in NiFi that receives flowfiles and routes them based on topic. However, every flowfile received is a batch that contains multiple messages, and the number of lines in each message varies, so I cannot split by line count. Is there a way in NiFi to split based on a specific text sequence? My main goal is to know how many messages each batch contains. It would be really helpful if I could either count how many times a specific word occurs in the flowfile content or split the flowfile on that text, because the number of splits would tell me how many messages are in the batch. I am using NiFi 1.1.0. Any suggestions would be truly appreciated!&lt;/P&gt;</description>
      <pubDate>Fri, 31 Mar 2017 06:00:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Need-help-on-splitting-a-text-on-NiFi-based-on-a-specific/m-p/181899#M58667</guid>
      <dc:creator>Adda_Fuentes2</dc:creator>
      <dc:date>2017-03-31T06:00:46Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on splitting a text on NiFi based on a specific content sequence or word</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Need-help-on-splitting-a-text-on-NiFi-based-on-a-specific/m-p/181900#M58668</link>
      <description>&lt;P&gt;The SplitContent processor may be what you are looking for: &lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitContent/index.html" target="_blank"&gt;https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitContent/index.html&lt;/A&gt;. It lets you define a byte sequence to split by.&lt;/P&gt;</description>
      <pubDate>Fri, 31 Mar 2017 13:53:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Need-help-on-splitting-a-text-on-NiFi-based-on-a-specific/m-p/181900#M58668</guid>
      <dc:creator>Former Member</dc:creator>
      <dc:date>2017-03-31T13:53:59Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on splitting a text on NiFi based on a specific content sequence or word</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Need-help-on-splitting-a-text-on-NiFi-based-on-a-specific/m-p/181901#M58669</link>
      <description>&lt;P&gt;
	As &lt;A rel="user" href="https://community.cloudera.com/users/12522/hbecker.html" nodeid="12522"&gt;@Hellmar Becker&lt;/A&gt; noted, SplitContent allows you to split on arbitrary byte sequences. &lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitText/index.html"&gt;&lt;CODE&gt;SplitText&lt;/CODE&gt;&lt;/A&gt; is another option, but it splits on line boundaries by line count, so it only helps when each message spans a predictable number of lines. You may also want to look at &lt;A target="_blank" href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.RouteText/index.html"&gt;&lt;CODE&gt;RouteText&lt;/CODE&gt;&lt;/A&gt;, which applies a literal or regular expression to every line in the flowfile content and routes each line individually based on whether it matches. &lt;/P&gt;&lt;P&gt;
	Finally, if you only care about how many times a specific word or sequence occurs in the flowfile, you could use a small script in &lt;CODE&gt;ExecuteScript&lt;/CODE&gt;, or even &lt;CODE&gt;ExecuteStreamCommand&lt;/CODE&gt; with a terminal command like &lt;CODE&gt;$ tr ' ' '\n' &amp;lt; FILE | grep WORD | wc -l&lt;/CODE&gt; (from &lt;A target="_blank" href="http://unix.stackexchange.com/a/2245/54193"&gt;here&lt;/A&gt;). &lt;/P&gt;</description>
      <pubDate>Tue, 04 Apr 2017 01:27:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Need-help-on-splitting-a-text-on-NiFi-based-on-a-specific/m-p/181901#M58669</guid>
      <dc:creator>alopresto</dc:creator>
      <dc:date>2017-04-04T01:27:26Z</dc:date>
    </item>
  </channel>
</rss>

