Spark Performance Tuning

What are the main tuning parameters I can set to make my Spark Streaming application faster? I have a cluster of 200 data nodes with 400 GB of RAM on each node. I have set executor-cores to 5, executor memory to 20 GB, and concurrent.tasks to 10. My writes to Hive are slow, which is slowing down the entire job.

3 REPLIES

Re: Spark Performance Tuning

Mentor

Can you describe your process of writing to Hive in more detail? Are you leveraging the Hive Streaming API? https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest

Tuning the number of transactions and batches per write can help you scale writes to Hive.
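
To make "transactions and batches per write" a bit more concrete, here is a rough Python sketch of the grouping logic only. It does not call the Hive Streaming API. The names `chunked`, `plan_writes`, `records_per_txn`, and `txns_per_batch` are hypothetical and exist purely for illustration. The general idea, as I understand it, is that fewer and larger transactions tend to mean fewer commits per write, at the cost of data becoming visible later.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(items: Iterable, size: int) -> Iterator[List]:
    """Yield consecutive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def plan_writes(records, records_per_txn, txns_per_batch):
    """Group records into transactions, then transactions into batches.

    Hypothetical helper: returns a list of batches, where each batch is a
    list of transactions and each transaction is a list of records.
    """
    transactions = chunked(records, records_per_txn)
    return list(chunked(transactions, txns_per_batch))


if __name__ == "__main__":
    # Illustrative sizes only; real values depend on your workload.
    plan = plan_writes(range(10), records_per_txn=3, txns_per_batch=2)
    for i, batch in enumerate(plan):
        print(f"batch {i}: {[len(txn) for txn in batch]} records per transaction")
```

You would then experiment with the two sizes and measure end-to-end write throughput, rather than trusting any fixed numbers.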

Re: Spark Performance Tuning

Rising Star

Re: Spark Performance Tuning

Expert Contributor

spark.apache.org has some tuning information for Spark Streaming. It is not specific to Hive, but the general advice may be helpful. The following link is for Spark 2.0.1:

http://spark.apache.org/docs/2.0.1/streaming-programming-guide.html#performance-tuning
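
For what it's worth, here is a small Python sketch of how a few settings from the Spark documentation could be passed to spark-submit. The property names (`spark.serializer`, `spark.streaming.blockInterval`, `spark.streaming.backpressure.enabled`) are real Spark settings, but the values and the helper names `to_submit_args` and `tasks_per_receiver` are my own assumptions for illustration, not recommendations for your cluster. The guide also notes that the number of tasks per receiver per batch is approximately the batch interval divided by the block interval, which the second helper computes.

```python
def to_submit_args(conf):
    """Render a dict of Spark properties as spark-submit --conf arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


def tasks_per_receiver(batch_interval_ms, block_interval_ms):
    """Approximate tasks per receiver per batch: batch interval / block interval."""
    return batch_interval_ms // block_interval_ms


# Illustrative values only; tune against your own workload.
streaming_conf = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.streaming.blockInterval": "200ms",
    "spark.streaming.backpressure.enabled": "true",
}

if __name__ == "__main__":
    print(" ".join(to_submit_args(streaming_conf)))
    # A 2 second batch with a 200 ms block interval
    print(tasks_per_receiver(2000, 200))
```

If the task count per batch is well below the number of available cores, lowering the block interval is one of the options the guide suggests for increasing parallelism.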