<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Split Attributes and pass into different Kafka Topics in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-Attributes-and-pass-into-different-Kafka-Topics/m-p/215593#M77085</link>
    <description>&lt;P&gt;I have a file with 36 numeric data fields per line, such as CPU usage, memory usage, date/time, and IP address. A line looks like this:&lt;BR /&gt;10.8.x.y, 151490..., 45.00, 95.00, 8979.09, 3984.90, ... (36 fields) &lt;/P&gt;&lt;P&gt;Currently there is a flow (GetFile --&amp;gt; PublishKafka_0_11) that sends everything to a single Kafka topic. &lt;/P&gt;&lt;P&gt;I now want to split this stream into two streams, one with 4 fields and one with 30 fields, and publish each to a different Kafka topic. &lt;/P&gt;&lt;P&gt;That is:&lt;/P&gt;&lt;P&gt;To Kafka topic 1: &lt;/P&gt;&lt;P&gt;10.8.x.y, 151490..., 45.00, 95.00&lt;/P&gt;&lt;P&gt;And to Kafka topic 2:&lt;/P&gt;&lt;P&gt;8979.09, 3984.90, ... etc. (30 fields)&lt;/P&gt;&lt;P&gt;How do I do that? SplitText only seems to split by line count. &lt;/P&gt;</description>
    <pubDate>Wed, 11 Apr 2018 12:45:00 GMT</pubDate>
    <dc:creator>merlin1</dc:creator>
    <dc:date>2018-04-11T12:45:00Z</dc:date>
    <item>
      <title>Split Attributes and pass into different Kafka Topics</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-Attributes-and-pass-into-different-Kafka-Topics/m-p/215593#M77085</link>
      <description>&lt;P&gt;I have a file with 36 numeric data fields per line, such as CPU usage, memory usage, date/time, and IP address. A line looks like this:&lt;BR /&gt;10.8.x.y, 151490..., 45.00, 95.00, 8979.09, 3984.90, ... (36 fields) &lt;/P&gt;&lt;P&gt;Currently there is a flow (GetFile --&amp;gt; PublishKafka_0_11) that sends everything to a single Kafka topic. &lt;/P&gt;&lt;P&gt;I now want to split this stream into two streams, one with 4 fields and one with 30 fields, and publish each to a different Kafka topic. &lt;/P&gt;&lt;P&gt;That is:&lt;/P&gt;&lt;P&gt;To Kafka topic 1: &lt;/P&gt;&lt;P&gt;10.8.x.y, 151490..., 45.00, 95.00&lt;/P&gt;&lt;P&gt;And to Kafka topic 2:&lt;/P&gt;&lt;P&gt;8979.09, 3984.90, ... etc. (30 fields)&lt;/P&gt;&lt;P&gt;How do I do that? SplitText only seems to split by line count. &lt;/P&gt;</description>
      <pubDate>Wed, 11 Apr 2018 12:45:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-Attributes-and-pass-into-different-Kafka-Topics/m-p/215593#M77085</guid>
      <dc:creator>merlin1</dc:creator>
      <dc:date>2018-04-11T12:45:00Z</dc:date>
    </item>
    <item>
      <title>Re: Split Attributes and pass into different Kafka Topics</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-Attributes-and-pass-into-different-Kafka-Topics/m-p/215594#M77086</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/39327/merlin.html" nodeid="39327" target="_blank"&gt;@Merlin Sundar&lt;/A&gt;&lt;P&gt;One way to do this is to use the PublishKafkaRecord processor, which reads the incoming flowfile data and writes only the required fields to a topic.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Fork the success&lt;/STRONG&gt; relationship from the &lt;STRONG&gt;GetFile processor&lt;/STRONG&gt; and use &lt;STRONG&gt;two PublishKafkaRecord processors&lt;/STRONG&gt; to publish messages to Kafka topic1 and Kafka topic2.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Flow:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="68466-kafka.png" style="width: 1792px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/16307iC13414AF86C1E241/image-size/medium?v=v2&amp;amp;px=400" role="button" title="68466-kafka.png" alt="68466-kafka.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;In the &lt;STRONG&gt;first PublishKafkaRecord processor&lt;/STRONG&gt;, configure the Record Reader as a CSVReader to read your incoming file (with 36 fields) and the Record Writer as a CSVRecordSetWriter that writes only the &lt;STRONG&gt;required four fields to KafkaTopic1&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;In the &lt;STRONG&gt;second PublishKafkaRecord processor&lt;/STRONG&gt;, configure the Record Reader as the &lt;STRONG&gt;same CSVReader as above to read your 36 fields&lt;/STRONG&gt; and the Record Writer as a CSVRecordSetWriter that writes only the required &lt;STRONG&gt;thirty fields to KafkaTopic2&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;This way we split the fields and publish them to the desired Kafka topics.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;(or)&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;If you want to first publish &lt;STRONG&gt;all the fields into one topic using the PublishKafka_0_11&lt;/STRONG&gt; processor and then split them into two, then:&lt;/P&gt;&lt;P&gt;Use a ConsumeKafka processor to consume the 36-field messages, and use two PublishKafkaRecord_0_11 processors with the same CSVReader as the record reader and different CSVRecordSetWriters, one with 4 fields and one with 30 fields.&lt;/P&gt;&lt;P&gt;Now we are consuming the messages and writing them to two different Kafka topics through publishers configured with different CSVRecordSetWriter controller services.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Flow:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="68468-publishkafka-approach1.png" style="width: 2260px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/16308i24C4C0C01A4F540F/image-size/medium?v=v2&amp;amp;px=400" role="button" title="68468-publishkafka-approach1.png" alt="68468-publishkafka-approach1.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;(or)&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;As another way to publish &lt;STRONG&gt;all the fields into one topic using the PublishKafka_0_11&lt;/STRONG&gt; processor and then split them into two:&lt;/P&gt;&lt;P&gt;Once you have published the messages to the Kafka topic with the PublishKafka_0_11 processor, use a ConsumeKafka processor to consume the published messages from the topic.&lt;/P&gt;&lt;P&gt;Then use two ConvertRecord processors in parallel with the same CSVReader controller service and two different CSVRecordSetWriter controller services, because we need to write 4 fields to the kafka1 topic and 30 fields to the kafka2 topic.&lt;/P&gt;&lt;P&gt;Then use two &lt;STRONG&gt;PublishKafka_0_11 processors&lt;/STRONG&gt; in parallel to publish the prepared messages from the two ConvertRecord processors.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Flow:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="68467-publishkafka-approach2.png" style="width: 2145px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/16309i6E2E38B33A65E4B9/image-size/medium?v=v2&amp;amp;px=400" role="button" title="68467-publishkafka-approach2.png" alt="68467-publishkafka-approach2.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;You can choose whichever of the three approaches above best fits your case.&lt;/P&gt;&lt;P&gt;Please refer to the link below for how to configure and use the Record Reader and Record Writer properties:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/articles/115311/convert-csv-to-json-avro-xml-using-convertrecord-p.html" target="_blank" rel="nofollow noopener noreferrer"&gt;https://community.hortonworks.com/articles/115311/convert-csv-to-json-avro-xml-using-convertrecord-p.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 03:17:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-Attributes-and-pass-into-different-Kafka-Topics/m-p/215594#M77086</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T03:17:34Z</dc:date>
    </item>
    <item>
      <title>Re: Split Attributes and pass into different Kafka Topics</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-Attributes-and-pass-into-different-Kafka-Topics/m-p/215595#M77087</link>
      <description>&lt;P&gt;In the meantime I tried something too. Here is a screenshot of the flow. In this flow I simply split the fields using a regular expression and then extracted what was needed through the success connections. I will certainly try one of the methods you mentioned as well and report back here. &lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929" target="_blank"&gt;@Shu&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="68478-screenshot-from-2018-04-16-101520.png" style="width: 1366px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/16306i528F69220CD8569F/image-size/medium?v=v2&amp;amp;px=400" role="button" title="68478-screenshot-from-2018-04-16-101520.png" alt="68478-screenshot-from-2018-04-16-101520.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 03:17:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Split-Attributes-and-pass-into-different-Kafka-Topics/m-p/215595#M77087</guid>
      <dc:creator>merlin1</dc:creator>
      <dc:date>2019-08-18T03:17:15Z</dc:date>
    </item>
  </channel>
</rss>

