<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Backing up Kafka in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/299967#M219960</link>
    <description>&lt;P&gt;Newbie question, apologies. We need to back up a Kafka cluster so that we can restore it to a given point in time (as far as the backup granularity allows) in case of problems, e.g. bad data. Replication would not help here, since bad data could also be replicated.&lt;/P&gt;&lt;P&gt;Does anyone out there have such a use case, and if so, how did you solve it (with Cloudera or open-source tools)?&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
    <pubDate>Sun, 19 Jul 2020 13:38:09 GMT</pubDate>
    <dc:creator>JEG</dc:creator>
    <dc:date>2020-07-19T13:38:09Z</dc:date>
    <item>
      <title>Backing up Kafka</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/299967#M219960</link>
      <description>&lt;P&gt;Newbie question, apologies. We need to back up a Kafka cluster so that we can restore it to a given point in time (as far as the backup granularity allows) in case of problems, e.g. bad data. Replication would not help here, since bad data could also be replicated.&lt;/P&gt;&lt;P&gt;Does anyone out there have such a use case, and if so, how did you solve it (with Cloudera or open-source tools)?&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Sun, 19 Jul 2020 13:38:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/299967#M219960</guid>
      <dc:creator>JEG</dc:creator>
      <dc:date>2020-07-19T13:38:09Z</dc:date>
    </item>
    <item>
      <title>Re: Backing up Kafka</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/299984#M219973</link>
      <description>&lt;P&gt;There's an open-source tool, &lt;A href="https://github.com/itadventurer/kafka-backup" target="_self"&gt;kafka-backup&lt;/A&gt;, that sounds like what you are looking for. I'm not sure I follow your granularity point, though.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jul 2020 04:47:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/299984#M219973</guid>
      <dc:creator>aakulov</dc:creator>
      <dc:date>2020-07-20T04:47:58Z</dc:date>
    </item>
    <item>
      <title>Re: Backing up Kafka</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/300022#M219994</link>
      <description>&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;Yes, I came across kafka-backup when searching around this area, but I was hoping there would be support from Cloudera itself, as a vendor that wraps Kafka with value-added services.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regarding granularity, I meant that if I took a backup every six hours, I would presumably be able to return to a point in time only at that granularity, e.g. to the state at 13:00, 19:00, 01:00, 07:00, etc., unless the backup capability included a continuous log that allowed fine-grained point-in-time recovery.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jul 2020 15:04:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/300022#M219994</guid>
      <dc:creator>JEG</dc:creator>
      <dc:date>2020-07-20T15:04:53Z</dc:date>
    </item>
    <item>
      <title>Re: Backing up Kafka</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/300024#M219996</link>
      <description>&lt;P&gt;OK, I get your granularity point. Thanks for clarifying.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Unfortunately, we don't have a Cloudera-supported tool that can do a simple backup of a Kafka cluster. I can only speculate on the reason, but it is probably that a backup (rather than replication) is rarely required.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jul 2020 15:59:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/300024#M219996</guid>
      <dc:creator>aakulov</dc:creator>
      <dc:date>2020-07-20T15:59:56Z</dc:date>
    </item>
    <item>
      <title>Re: Backing up Kafka</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/300083#M220024</link>
      <description>&lt;P&gt;You can use NiFi to save your Kafka messages into HDFS (for instance).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Something like this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Sans titre.png" style="width: 816px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/28324iB0C113046D54D1BB/image-size/large?v=v2&amp;amp;px=999" role="button" title="Sans titre.png" alt="Sans titre.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;- ConsumeKafka: the flowfile content is the Kafka message itself, and you have access to some attributes: topic name, partition, offset, key... (but not the timestamp!). When I need it, I store the timestamp in the key.&lt;/P&gt;&lt;P&gt;- ReplaceText: builds your backup line from the flowfile content and attributes&lt;/P&gt;&lt;P&gt;- MergeContent: builds a big file containing multiple Kafka messages&lt;/P&gt;&lt;P&gt;- ExtractText: sets an attribute to be used as the filename&lt;/P&gt;&lt;P&gt;- PutHDFS: saves the created file into HDFS&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And you can do the reverse if you need to push the data back to your Kafka cluster.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jul 2020 06:27:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Backing-up-Kafka/m-p/300083#M220024</guid>
      <dc:creator>Kezia</dc:creator>
      <dc:date>2020-07-21T06:27:32Z</dc:date>
    </item>
  </channel>
</rss>

