<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Benchmark Cloudera, hortonworks and MapR in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35319#M12685</link>
    <description>Yes, I think that begins to narrow it down. I don't know that you're&lt;BR /&gt;going to find a big performance difference, since distributions will&lt;BR /&gt;generally ship the upstream project with only minimal modifications to&lt;BR /&gt;integrate it.&lt;BR /&gt;&lt;BR /&gt;(That said, CDH does let you enable native acceleration for some&lt;BR /&gt;mathematical operations in Spark MLlib. I don't think other distros&lt;BR /&gt;enable this and ship the right libraries. It's possible that could&lt;BR /&gt;matter to your use case.)&lt;BR /&gt;&lt;BR /&gt;I'd look at how recent the Spark distribution is. Cloudera ships Spark&lt;BR /&gt;1.5 in CDH 5.5; MapR is on 1.4 and Hortonworks on 1.3, both with a beta&lt;BR /&gt;preview of 1.5 at the moment. We're already integrating&lt;BR /&gt;the nearly-released Spark 1.6 too.&lt;BR /&gt;&lt;BR /&gt;Finally, if you're considering paying for support, I think it bears&lt;BR /&gt;evaluating how much each vendor invests in Spark. No investment means&lt;BR /&gt;no expertise and no real ability to fix your problems. At Cloudera, we&lt;BR /&gt;have a full-time team on Spark, with 4 committers (including me).&lt;BR /&gt;I think you'll find other vendors virtually non-existent in the Spark&lt;BR /&gt;community, but go see for yourself.&lt;BR /&gt;&lt;BR /&gt;</description>
    <pubDate>Wed, 16 Dec 2015 11:23:28 GMT</pubDate>
    <dc:creator>srowen</dc:creator>
    <dc:date>2015-12-16T11:23:28Z</dc:date>
    <item>
      <title>Benchmark Cloudera, hortonworks and MapR</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35313#M12682</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have to choose between Cloudera, Hortonworks and MapR.&lt;/P&gt;&lt;P&gt;I don't know how to compare the performance of these distributions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help?&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Dec 2015 10:04:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35313#M12682</guid>
      <dc:creator>tsunami20</dc:creator>
      <dc:date>2015-12-16T10:04:28Z</dc:date>
    </item>
    <item>
      <title>Re: Benchmark Cloudera, hortonworks and MapR</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35317#M12683</link>
      <description>First, you'd have to define what you're trying to "benchmark". I don't&lt;BR /&gt;think these distributions vary much in speed; they differ mainly in the&lt;BR /&gt;components they include around the core. That is, it's kind of like&lt;BR /&gt;choosing a car solely by its max RPM or something, even if that's&lt;BR /&gt;important to you.&lt;BR /&gt;</description>
      <pubDate>Wed, 16 Dec 2015 10:46:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35317#M12683</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2015-12-16T10:46:28Z</dc:date>
    </item>
    <item>
      <title>Re: Benchmark Cloudera, hortonworks and MapR</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35318#M12684</link>
      <description>Thank you for your reply.&lt;BR /&gt;Actually, after choosing a distribution I have to work with Spark and&lt;BR /&gt;extract data from social networks.&lt;BR /&gt;So should I just test algorithms with Spark on each distribution?&lt;BR /&gt;</description>
      <pubDate>Wed, 16 Dec 2015 11:03:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35318#M12684</guid>
      <dc:creator>tsunami20</dc:creator>
      <dc:date>2015-12-16T11:03:28Z</dc:date>
    </item>
    <item>
      <title>Re: Benchmark Cloudera, hortonworks and MapR</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35319#M12685</link>
      <description>Yes, I think that begins to narrow it down. I don't know that you're&lt;BR /&gt;going to find a big performance difference, since distributions will&lt;BR /&gt;generally ship the upstream project with only minimal modifications to&lt;BR /&gt;integrate it.&lt;BR /&gt;&lt;BR /&gt;(That said, CDH does let you enable native acceleration for some&lt;BR /&gt;mathematical operations in Spark MLlib. I don't think other distros&lt;BR /&gt;enable this and ship the right libraries. It's possible that could&lt;BR /&gt;matter to your use case.)&lt;BR /&gt;&lt;BR /&gt;I'd look at how recent the Spark distribution is. Cloudera ships Spark&lt;BR /&gt;1.5 in CDH 5.5; MapR is on 1.4 and Hortonworks on 1.3, both with a beta&lt;BR /&gt;preview of 1.5 at the moment. We're already integrating&lt;BR /&gt;the nearly-released Spark 1.6 too.&lt;BR /&gt;&lt;BR /&gt;Finally, if you're considering paying for support, I think it bears&lt;BR /&gt;evaluating how much each vendor invests in Spark. No investment means&lt;BR /&gt;no expertise and no real ability to fix your problems. At Cloudera, we&lt;BR /&gt;have a full-time team on Spark, with 4 committers (including me).&lt;BR /&gt;I think you'll find other vendors virtually non-existent in the Spark&lt;BR /&gt;community, but go see for yourself.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 16 Dec 2015 11:23:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35319#M12685</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2015-12-16T11:23:28Z</dc:date>
    </item>
    <item>
      <title>Re: Benchmark Cloudera, hortonworks and MapR</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35355#M12686</link>
      <description>Is it possible to handle big data cleansing with Spark?&lt;BR /&gt;</description>
      <pubDate>Thu, 17 Dec 2015 09:49:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Benchmark-Cloudera-hortonworks-and-MapR/m-p/35355#M12686</guid>
      <dc:creator>tsunami20</dc:creator>
      <dc:date>2015-12-17T09:49:28Z</dc:date>
    </item>
  </channel>
</rss>

