<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Separate CSV column by delimiter or whitespace in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Separate-CSV-column-by-delimiter-or-whitespace/m-p/351266#M236199</link>
    <description>&lt;P&gt;I have a CSV where one of the columns can sometimes contain a pair of values, such as:&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="100%"&gt;Column_name&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;1 , 2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;2.5 3.2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;2.9 - 3.2&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;The delimiter could be almost anything, but there will always be a delimiter. Also, the values will only come as pairs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My question is: is there an efficient way to split this one column so that, in the resulting flowfile, each value has its own column, giving the following result?&lt;/P&gt;&lt;TABLE border="1" width="66.66666666666667%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;Column #1&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;Column #2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;1&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;2.5&amp;nbsp;&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;3.2&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
    <pubDate>Wed, 31 Aug 2022 19:40:29 GMT</pubDate>
    <dc:creator>Jacccs</dc:creator>
    <dc:date>2022-08-31T19:40:29Z</dc:date>
    <item>
      <title>Separate CSV column by delimiter or whitespace</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Separate-CSV-column-by-delimiter-or-whitespace/m-p/351266#M236199</link>
      <description>&lt;P&gt;I have a CSV where one of the columns can sometimes contain a pair of values, such as:&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="100%"&gt;Column_name&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;1 , 2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;2.5 3.2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;2.9 - 3.2&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;The delimiter could be almost anything, but there will always be a delimiter. Also, the values will only come as pairs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My question is: is there an efficient way to split this one column so that, in the resulting flowfile, each value has its own column, giving the following result?&lt;/P&gt;&lt;TABLE border="1" width="66.66666666666667%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;Column #1&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;Column #2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;1&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%"&gt;2.5&amp;nbsp;&lt;/TD&gt;&lt;TD width="33.333333333333336%"&gt;3.2&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Wed, 31 Aug 2022 19:40:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Separate-CSV-column-by-delimiter-or-whitespace/m-p/351266#M236199</guid>
      <dc:creator>Jacccs</dc:creator>
      <dc:date>2022-08-31T19:40:29Z</dc:date>
    </item>
    <item>
      <title>Re: Separate CSV column by delimiter or whitespace</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Separate-CSV-column-by-delimiter-or-whitespace/m-p/351276#M236203</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm not sure this is possible with an out-of-the-box processor. One option is to use ReplaceText first to replace the different delimiters, such as a hyphen (-) or whitespace (\s), with a common delimiter such as a comma (,). However, if there is whitespace before or after another delimiter such as (-) or (,), this is not going to work, since the surrounding whitespace would be replaced as well and leave extra delimiters. Another option is to use the ExecuteScript processor: read each line (after the header) from the flowfile content, then apply a string split function, trying one delimiter after another. Once a split yields two array elements, construct the new string with the new column header and delimiter, and transfer it to the success relationship.&lt;/P&gt;</description>
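      <!--
      The split-and-retry idea in the reply above could look roughly like the
      sketch below. It is plain Python and is not wired into the ExecuteScript
      session API. The candidate delimiter list, the function names split_pair
      and split_column, and the output header names are all assumptions made for
      illustration. Negative values such as "-1 - 2" are not handled, because the
      hyphen is treated as a delimiter.

```python
import csv
import io

# Candidate delimiters, tried in order. This list is an assumption; extend it as needed.
# None means "split on any run of whitespace" and goes last, so that a value
# like "1 , 2" is split on the comma before whitespace is considered.
CANDIDATE_DELIMITERS = [",", "-", ";", "|", None]


def split_pair(value):
    """Return the two values found in `value`, or None if no delimiter yields a pair."""
    for delimiter in CANDIDATE_DELIMITERS:
        parts = [part.strip() for part in value.split(delimiter)]
        parts = [part for part in parts if part]  # drop empty pieces
        if len(parts) == 2:
            return parts[0], parts[1]
    return None


def split_column(csv_text, column, new_columns=("Column #1", "Column #2")):
    """Return csv_text with `column` replaced by the two columns in `new_columns`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = []
    for name in reader.fieldnames:
        fieldnames.extend(new_columns if name == column else [name])

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Rows that cannot be split into a pair get two empty cells.
        pair = split_pair(row.pop(column) or "") or ("", "")
        row[new_columns[0]], row[new_columns[1]] = pair
        writer.writerow(row)
    return out.getvalue()
```

      For example, split_pair("2.9 - 3.2") gives ("2.9", "3.2"). Trying
      whitespace last is a deliberate choice: it avoids the problem the reply
      mentions, where spaces around another delimiter get in the way.
      -->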
      <pubDate>Wed, 31 Aug 2022 22:02:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Separate-CSV-column-by-delimiter-or-whitespace/m-p/351276#M236203</guid>
      <dc:creator>SAMSAL</dc:creator>
      <dc:date>2022-08-31T22:02:00Z</dc:date>
    </item>
  </channel>
</rss>

