Spark on YARN vs Mesos?

What are the considerations for running Spark on YARN vs Spark on Mesos?

1 ACCEPTED SOLUTION

Re: Spark on YARN vs Mesos?

@Ali Bajwa

Spark on YARN

4x the usage of Spark on Mesos, per the 2015 Databricks Spark survey*

* http://cdn2.hubspot.net/hubfs/438089/DataBricks_Surveys_-_Content/Spark-Survey-2015-Infographic.pdf?...

Kerberos Support

Dynamic Executor Allocation – Scale up & down

Dynamic Executor Allocation with data locality (Hortonworks engineering added data locality to dynamic executor allocation)

Better cluster utilization - More than one Executor per node

Spark on Mesos

No Kerberos Support

No data locality for Dynamic Executor Allocation

Dynamic Executor Allocation – only scales down***

Inefficient cluster utilization – limited to one executor per slave****

http://spark.apache.org/docs/latest/running-on-mes...

http://apache-spark-developers-list.1001551.n3.nab...
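For anyone who wants to try the dynamic executor allocation mentioned above on YARN, here is a minimal sketch of the relevant settings. It assumes the standard `spark.dynamicAllocation.*` and `spark.shuffle.service.enabled` properties and that the external shuffle service is already set up on the cluster; the helper names `dynamic_allocation_conf` and `to_submit_args` and the executor counts are made up for illustration, so check the values against your own cluster.

```python
def dynamic_allocation_conf(min_executors, max_executors):
    """Return Spark properties that turn on dynamic executor allocation."""
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return {
        "spark.dynamicAllocation.enabled": "true",
        # Dynamic allocation relies on the external shuffle service so that
        # shuffle files survive when an idle executor is removed.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


def to_submit_args(conf):
    """Render the properties as repeated `--conf key=value` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Illustrative bounds only: scale between 2 and 20 executors.
    print(" ".join(to_submit_args(dynamic_allocation_conf(2, 20))))
```

The printed arguments can be appended to a `spark-submit` invocation; the same keys could instead go into `spark-defaults.conf`.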

Re: Spark on YARN vs Mesos?

Any updates on this for late 2016? What about Spark 2 support, or 1.6.2?

Re: Spark on YARN vs Mesos?

A key one is straightforward: HDFS is where the data is, and YARN schedules work close to that data. YARN clusters are very widely deployed, and Spark on YARN lets you run Spark queries against such a cluster without even needing to ask permission from the cluster ops team. To them, it's just another client job.
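To make the "just another client job" point concrete, here is a rough sketch that assembles a `spark-submit` command line for a YARN cluster. The flags used (`--master yarn`, `--deploy-mode`, `--num-executors`, `--executor-cores`, `--executor-memory`, `--principal`, `--keytab`) are standard `spark-submit` options, but the helper `yarn_submit_command`, the jar name, the class name, and the sizing numbers are all hypothetical placeholders.

```python
import shlex


def yarn_submit_command(app_jar, main_class, num_executors=4,
                        executor_cores=2, executor_memory="4g",
                        principal=None, keytab=None):
    """Build a spark-submit argument list targeting YARN in cluster mode."""
    cmd = [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--class", main_class,
        # YARN can place several executors on the same node.
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
        "--executor-memory", executor_memory,
    ]
    # Optional Kerberos credentials for secured clusters.
    if principal and keytab:
        cmd += ["--principal", principal, "--keytab", keytab]
    cmd.append(app_jar)
    return cmd


if __name__ == "__main__":
    # Hypothetical application jar and main class.
    parts = yarn_submit_command("my-app.jar", "com.example.MyJob")
    print(" ".join(shlex.quote(p) for p in parts))
```

Nothing here needs special treatment from the cluster side: the command goes through the normal YARN submission path like any other application.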