Explorer
Posts: 10
Registered: ‎01-31-2017

What's the right number of cores and executors for a spark streaming application?

I have a Spark Streaming application that reads from 4 different Kafka topics, each with 3 partitions. The reads happen at different times (I have 4 pipelines processed in sequence), so my idea was that I need just 3 Spark executors (one for each partition of a topic) with one core each. When I submit the application this way, I can see that execution is not parallelized across the executors, and the processing time is very high relative to the complexity of the computation. What's wrong with this assumption?

If I run the same application with 4 executors with 4 cores each, the execution is parallelized across all the executors and the processing time is low.

I'm wondering whether there are best practices for the number of executors and cores per topic/partition when consuming from a Kafka topic with Spark Streaming.

 

Thanks in advance,

Beniamino

Cloudera Employee
Posts: 481
Registered: ‎08-11-2014

Re: What's the right number of cores and executors for a spark streaming application?

If you have 4 topics with 3 partitions each, then you need 12 executor slots to process fully in parallel. You have only 3 slots. If you are using receiver-based streaming, you may need 1 more, too.

Also, 1 core per executor is generally very low.

Your result is therefore not surprising, and your second config is much more reasonable.
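
As a rough illustration of the slot arithmetic above, here is a minimal Python sketch. The helper names `task_slots` and `kafka_partitions` are made up for this example and are not part of Spark. It assumes each executor can run `cores_per_executor // cpus_per_task` tasks at once, where `cpus_per_task` stands in for Spark's `spark.task.cpus` setting (default 1).

```python
def kafka_partitions(num_topics: int, partitions_per_topic: int) -> int:
    """Total Kafka partitions across all topics (hypothetical helper)."""
    return num_topics * partitions_per_topic


def task_slots(num_executors: int, cores_per_executor: int, cpus_per_task: int = 1) -> int:
    """Concurrent task slots, assuming each task occupies cpus_per_task cores."""
    return num_executors * (cores_per_executor // cpus_per_task)


if __name__ == "__main__":
    needed = kafka_partitions(num_topics=4, partitions_per_topic=3)
    first_config = task_slots(num_executors=3, cores_per_executor=1)
    second_config = task_slots(num_executors=4, cores_per_executor=4)
    print("slots needed for full parallelism:", needed)
    print("first config slots:", first_config)
    print("second config slots:", second_config)
```

Under these assumptions the first configuration offers far fewer slots than there are partitions, while the second offers more than enough.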
Explorer
Posts: 10
Registered: ‎01-31-2017

Re: What's the right number of cores and executors for a spark streaming application?

 

I'm using directStream and the topics are read one by one, so I thought 3 tasks would be enough.

The strange thing is that I observe different behavior when running the same application on another cluster.

The second cluster is smaller than the first: it has 3 brokers instead of 4. To reach good performance there, I need to run the application with 6 executors with 1 core each, and I can see that only 3 executors receive work.

 

Could the described scenario be related to the architecture of the cluster?

 

Thanks again,

Beniamino

New Contributor
Posts: 1
Registered: ‎09-02-2018

Re: What's the right number of cores and executors for a spark streaming application?


@srowen Are 12 executors really necessary? Surely you just need a total of 12 cores (so you could have 1 executor with 12 cores).

 

Is this what you mean by "Also, 1 core per executor is generally very low."?

 

What happens when you have more cores than Kafka partitions? Will it generally run faster?
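
One way to reason about this question is a small Python sketch, under the assumption that the direct stream creates one task per Kafka partition in each batch and that the job does not repartition the data. The function `batch_schedule` is a hypothetical helper, not a Spark API. It estimates how many scheduling "waves" a batch needs and how many slots sit idle during the read stage.

```python
import math
from typing import Tuple


def batch_schedule(num_tasks: int, slots: int) -> Tuple[int, int]:
    """Return (waves, idle_slots) for one batch.

    Assumes one task per Kafka partition and equal-length tasks;
    both are simplifications made for illustration only.
    """
    if num_tasks < 0 or slots <= 0:
        raise ValueError("num_tasks must be >= 0 and slots must be > 0")
    waves = math.ceil(num_tasks / slots)  # rounds of tasks run back to back
    idle_slots = max(0, slots - num_tasks)  # slots with nothing to do
    return waves, idle_slots


if __name__ == "__main__":
    # (tasks, slots): 12 partitions on 3, 12, and 16 slots,
    # and a single 3-partition topic on 6 one-core executors.
    for tasks, slots in [(12, 3), (12, 12), (12, 16), (3, 6)]:
        waves, idle = batch_schedule(tasks, slots)
        print(f"tasks={tasks} slots={slots} waves={waves} idle={idle}")
```

Under these assumptions, 12 slots are reached equally by 12 executors with 1 core or 1 executor with 12 cores, and slots beyond the partition count stay idle for the read stage, so extra cores would only help if later stages produce more tasks (for example after a repartition).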