<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Kafka using Docker for production clusters in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281900#M209645</link>
    <description>&lt;P&gt;You mentioned the HDF kit.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Until now we have worked with HDP and Ambari.&lt;/P&gt;&lt;P&gt;Is HDF the same concept as HDP (including blueprints, in case we want to automate the installation process)?&lt;/P&gt;</description>
    <pubDate>Sat, 02 Nov 2019 23:57:50 GMT</pubDate>
    <dc:creator>mike_bronson7</dc:creator>
    <dc:date>2019-11-02T23:57:50Z</dc:date>
    <item>
      <title>Kafka using Docker for production clusters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281565#M209441</link>
      <description>&lt;P&gt;We need to build a production Kafka cluster with 3-5 nodes.&lt;/P&gt;
&lt;P&gt;We have the following options:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Kafka in Docker containers (the cluster includes ZooKeeper and Schema Registry on each node)&lt;/LI&gt;
&lt;LI&gt;Kafka without Docker (the cluster includes ZooKeeper and Schema Registry on each node)&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Since this is a production cluster, we need good performance: we have heavy disk reads and writes (disk size is 10 TB), so good I/O performance is essential.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;So does Kafka on Docker meet the requirements for production clusters?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;More info: &lt;A href="https://www.infoq.com/articles/apache-kafka-best-practices-to-optimize-your-deployment/" target="_blank" rel="noopener"&gt;https://www.infoq.com/articles/apache-kafka-best-practices-to-optimize-your-deployment&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 29 Oct 2019 14:22:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281565#M209441</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2019-10-29T14:22:33Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka using Docker for production clusters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281579#M209454</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/59349"&gt;@mike_bronson7&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Your plans are doable, and that's how many companies have deployed their production Kafka clusters when they intend to use ONLY Kafka. You could take it a step further and add HA and reliability by orchestrating all of that with Kubernetes and PVCs, which is a great idea.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Running Kafka as a microservice on Kubernetes has become the norm and the path of least resistance. It is very difficult to allocate physical machines with local disks for Kafka, and companies running on VMs have found that deploying Kafka outside of Kubernetes causes significant organizational headaches.&lt;/P&gt;&lt;P&gt;Running Kafka on Kubernetes gets your environment allocated faster, so you can spend your time on productive work rather than firefighting. Kafka management becomes much easier on Kubernetes: scaling up is simpler, since adding new brokers is a single command or a single line in a configuration file, and it is easier to perform configuration changes, upgrades and restarts on all brokers and all clusters.&lt;BR /&gt;Kafka is a stateful service, and this does make the Kubernetes configuration more complex than it is for stateless microservices. 
The biggest challenge is configuring storage and network, and you’ll want to make sure both subsystems deliver consistent low latency. That is where PVCs [Persistent Volume Claims] come in, for the use of shared storage.&lt;BR /&gt;The beauty is that Kafka will run as a &lt;STRONG&gt;Pod&lt;/STRONG&gt;, and you can configure a fixed number that MUST be running at any time and scale when needed with a single &lt;STRONG&gt;kubectl&lt;/STRONG&gt; or &lt;STRONG&gt;Helm&lt;/STRONG&gt; command. That is elasticity at play!&lt;/P&gt;&lt;P&gt;Kafka also poses a challenge most stateful services don’t: brokers are not interchangeable, and clients need to communicate directly with the broker that holds the leader replica of each partition they produce to or consume from. You can’t place all brokers behind a single load balancer address; you must devise a way to route messages to a specific broker. Here is some good reading: &lt;A href="https://www.confluent.io/confluent-operator/" target="_blank" rel="noopener"&gt;Recommendations for Deploying Apache Kafka on Kubernetes&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Happy hadooping&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 29 Oct 2019 18:49:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281579#M209454</guid>
      <dc:creator>Shelton</dc:creator>
      <dc:date>2019-10-29T18:49:00Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka using Docker for production clusters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281669#M209512</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;First, thank you for the full explanation.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But for now we can't work with&amp;nbsp;&lt;SPAN&gt;Kubernetes (for internal reasons).&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;So the option is to work with Docker.&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Based on that, do you think a Kafka cluster running in Docker will perform worse than a Kafka cluster without Docker?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 30 Oct 2019 19:27:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281669#M209512</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2019-10-30T19:27:25Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka using Docker for production clusters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281680#M209519</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/59349"&gt;@mike_bronson7&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Docker containers provide an ideal foundation for running Kafka-as-a-Service on-premises or in the public cloud. However, using Docker containers in production environments poses some challenges, including container management, scheduling, network configuration and security, and performance.&lt;/P&gt;&lt;P&gt;By default, a container has no resource constraints and can use as much of a given resource as the host’s kernel scheduler allows; likewise, each container’s access to the host machine’s CPU cycles is unlimited by default.&lt;BR /&gt;It is important not to allow a running container to consume too much of the host machine’s memory.&lt;/P&gt;&lt;P&gt;As you are aware, Kafka needs ZooKeeper, so you have to architect your Kafka deployment well. Once you master it, though, it's a piece of cake, and it brings a lot of advantages such as easier upgrades and scaling out.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As I said, it's a good move, so get your hands dirty &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 30 Oct 2019 20:32:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281680#M209519</guid>
      <dc:creator>Shelton</dc:creator>
      <dc:date>2019-10-30T20:32:21Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka using Docker for production clusters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281684#M209523</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;To quote what you said: "&lt;SPAN&gt;some challenges including container management, scheduling, network configuration and security, and performance"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;So I understand that you think containers can have a negative effect on&amp;nbsp;&lt;/SPAN&gt;performance.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The question is whether this is a very minor effect or a major effect on performance.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As I mentioned, we have two choices:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Install a Kafka cluster from Confluent with ZooKeeper and Schema Registry&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;OR&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Install Kafka using Docker, with ZooKeeper and Schema Registry from Confluent&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;A third choice is:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Install a Kafka cluster from the HDF kit (with Kafka + ZooKeeper + Schema Registry)&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please give your professional opinion.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Which of these three options gives the best Kafka cluster (focusing on performance in a production environment)?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 30 Oct 2019 21:24:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281684#M209523</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2019-10-30T21:24:18Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka using Docker for production clusters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281798#M209589</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/59349"&gt;@mike_bronson7&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Confluent and Kafka are inseparable &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; HDF also has good tooling around Kafka, but what you decide on usually depends on the skill sets at hand. Containerized apps are now the norm, for the reasons shared before; nevertheless, HDF 3.1 is packaged with SAM, NiFi, Ambari, Registry and Ranger, which is quite a complete offering.&lt;/P&gt;&lt;P&gt;With the Dockerized version, however, you have many moving parts, and synchronizing Kafka, ZooKeeper and the registry could be a challenge without the right skill sets. On the positive side, you gain easier upgrades and deployment, plus OS-agnostic portability.&lt;/P&gt;&lt;P&gt;The choice is yours &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2019 20:13:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281798#M209589</guid>
      <dc:creator>Shelton</dc:creator>
      <dc:date>2019-10-31T20:13:23Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka using Docker for production clusters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281900#M209645</link>
      <description>&lt;P&gt;You mentioned the HDF kit.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Until now we have worked with HDP and Ambari.&lt;/P&gt;&lt;P&gt;Is HDF the same concept as HDP (including blueprints, in case we want to automate the installation process)?&lt;/P&gt;</description>
      <pubDate>Sat, 02 Nov 2019 23:57:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281900#M209645</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2019-11-02T23:57:50Z</dc:date>
    </item>
    <item>
      <title>Re: Kafka using Docker for production clusters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281909#M209653</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/59349"&gt;@mike_bronson7&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yes, it's possible to deploy HDF using Ambari blueprints.&amp;nbsp; If you compare an HDP blueprint with an HDF blueprint, you will notice a difference in the components section only.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/t5/Community-Articles/Automate-deployment-of-HDF-3-1-clusters-using-Ambari/ta-p/247802" target="_blank" rel="noopener"&gt;Deploy HDF using a blueprint (link 1)&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://github.com/seanorama/ambari-bootstrap.git" target="_blank" rel="noopener"&gt;Deploy HDF using a blueprint (link 2)&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://gist.github.com/abajwa-hw/ae4125c5154deac6713cdd25d2b83620" target="_blank" rel="noopener"&gt;Deploy HDF using a blueprint (link 3)&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;The links above show how it can be done.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 03 Nov 2019 14:47:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Kafka-using-Docker-for-production-clusters/m-p/281909#M209653</guid>
      <dc:creator>Shelton</dc:creator>
      <dc:date>2019-11-03T14:47:59Z</dc:date>
    </item>
  </channel>
</rss>

