Benchmark Hortonworks, Cloudera, and MapR

avatar
Explorer

Hi,

I have to choose between Cloudera, Hortonworks, and MapR.

I don't know how to compare the performance of these distributions.

After choosing a distribution, I have to work with Spark and extract data from social networks. So should I just test some algorithms with Spark on each distribution?

Any help?

Thanks in advance

1 ACCEPTED SOLUTION

avatar

Lobna,

Every vendor ships Apache Spark, so there isn't a difference there.

But there are other differences you can use to evaluate which vendor to choose:

  1. As Neeraj mentioned, the vendors themselves differ: fully open source, partially open source, or limited open source.
  2. Which version of Spark each vendor's distro supports: HDP 2.3.4, coming out this week, supports Spark 1.5.2.
  3. Which Spark components the vendor supports (we support Spark Core, ML, Streaming, and SQL).
  4. What each vendor's focus on Spark is (ours is detailed here: http://hortonworks.com/blog/spark-hdp-perfect-tog...
  5. How well the vendor can support you (Hortonworks' support is top rated).

There are many other factors to consider, but this should give you some ideas.

Thanks, Vinay

7 REPLIES

avatar
Master Mentor

@lobna tonn Hi Lobna, I highly recommend doing more research on the business model and core technology of these vendors. You can start with a POC (prepare a use case) that loads your own data, or start with https://github.com/hortonworks/hive-testbench once the cluster is up. My LinkedIn address is in my profile. Please feel free to add me and we can talk about it.

avatar
Explorer

Thank you for your reply. Do you think there is no big performance difference between them? I know that CDH enables native acceleration for some mathematical operations in Spark MLlib, and that it ships the most recent Spark version (1.5). Should I test, for example, a wordcount algorithm with Spark on each distribution?

avatar
Master Mentor

@lobna tonn Please see this https://hortonworks.com/press-releases/hortonworks-accelerates-spark-at-scale-for-the-enterprise/

Spark 1.5.2 is part of the HDP stack. You can try running wordcount, but I highly recommend looking into the core business model too.
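If you do go the wordcount route, it helps to time the same job the same way on every cluster. Below is a minimal sketch of such a timing harness, not a definitive benchmark. It assumes that `spark-submit` is on the PATH of each cluster's edge node; the script name and input path passed to `spark_submit_workload` are hypothetical placeholders, and `local_wordcount` is only a stand-in so the harness can be tried without a cluster.

```python
import statistics
import subprocess
import time


def time_runs(workload, runs=5, warmup=1):
    """Call `workload` repeatedly and return wall-clock seconds per timed run."""
    for _ in range(warmup):
        workload()  # warm-up runs are executed but not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of timings to min, median, and max."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def spark_submit_workload(script, input_path):
    """Build a workload that launches one spark-submit job.

    `script` and `input_path` are hypothetical: point them at your own
    wordcount job and dataset on each distribution.
    """
    def run():
        subprocess.run(["spark-submit", script, input_path], check=True)
    return run


def local_wordcount():
    """Stand-in workload: count words in a small in-memory string."""
    counts = {}
    for word in ("spark hadoop spark yarn " * 1000).split():
        counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    # Swap local_wordcount for spark_submit_workload("wordcount.py", "/data/input")
    # (both arguments are placeholders) when running against a real cluster.
    print(summarize(time_runs(local_wordcount, runs=3)))
```

Use the same data, the same Spark version where possible, and comparable hardware on each distribution; otherwise the timings say more about the setup than about the vendor.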

avatar
Explorer

In the core business model it says that HDP ships Spark 1.3 in HDP 2.3, with a beta preview of 1.5. Also, is there a way to pay for support for HDP?

avatar
Master Mentor

@lobna tonn

You get Spark 1.4.1 when you install HDP 2.3.2, but if you want to upgrade to 1.5.2, we can help you with that.

Core business model means 100% open source. Read this

avatar

@lobna tonn How are your tests going? Did you decide?