Created 12-08-2015 05:29 AM
What are the considerations for running Spark on YARN vs Spark on Mesos?
Created 12-08-2015 12:23 PM
Spark on YARN
Roughly 4X the usage of Spark on Mesos*
Kerberos Support
Dynamic Executor Allocation – Scale up & down
Dynamic Executor Allocation with data locality (Hortonworks engineering added data locality to dynamic executor allocation)
Better cluster utilization - More than one Executor per node
Spark on Mesos
No Kerberos Support
No data locality for Dynamic Executor Allocation
Dynamic Executor Allocation – Only scale down***
Inefficient cluster utilization – limits one executor per slave****
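To make the YARN-side points concrete, here is a minimal sketch of how a job might be submitted with dynamic executor allocation and Kerberos credentials. It only assembles a spark-submit command line; it does not run Spark. The helper name build_spark_submit, the jar path, the class name, the principal, the keytab path, and the executor bounds are all hypothetical placeholders. The spark.dynamicAllocation.* and spark.shuffle.service.enabled properties and the --principal / --keytab options are standard Spark-on-YARN settings, but check them against the documentation for your Spark version.

```python
import shlex


def build_spark_submit(app_jar, main_class, min_executors=1, max_executors=20,
                       principal=None, keytab=None):
    """Assemble a spark-submit argument list for a YARN cluster (hypothetical helper)."""
    conf = {
        # Let Spark request and release executors as the workload changes.
        "spark.dynamicAllocation.enabled": "true",
        # Dynamic allocation relies on the external shuffle service on each node.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }
    args = ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster",
            "--class", main_class]
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    # Kerberos login details are only added when both values are supplied.
    if principal and keytab:
        args += ["--principal", principal, "--keytab", keytab]
    args.append(app_jar)
    return args


# Placeholder values for illustration only.
cmd = build_spark_submit(
    "my-app.jar",
    "com.example.MyApp",
    min_executors=2,
    max_executors=10,
    principal="user@EXAMPLE.COM",
    keytab="/path/to/user.keytab",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line can be pasted into a shell on a node that has the Spark client installed. The external shuffle service must also be enabled on the YARN NodeManagers for dynamic allocation to work.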
Created 10-18-2016 03:34 PM
Any updates for late 2016 on this? Spark 2 support? 1.6.2?
Created 12-09-2015 07:17 PM
A key one is straightforward: HDFS is where the data is, and YARN schedules work close to that data. YARN clusters are very widely deployed, and Spark on YARN lets you run Spark queries against such a cluster without even needing to ask permission from the cluster ops team. To them, it's just another client job.