Created 09-26-2025 05:43 AM
Looking for help with Cloudera Data Platform (CDP) Spark job optimization. We're running large-scale ETL jobs that time out during the shuffle phase, consume excessive cluster resources, and spill memory to disk.
The jobs process ~500 GB datasets, but execution times have increased 3x since migrating to CDP. We need someone experienced with Spark tuning on Cloudera and YARN resource management to identify the bottlenecks.
Seeking 3-4 hours of remote performance analysis to optimize job configuration and cluster settings. Must be resolved by Tuesday to meet our data pipeline SLA.
Created 09-26-2025 10:01 AM
@Brenda99 Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our CDP experts @venkatsambath @abdulpasithali @upadhyayk04 who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres
Created 09-29-2025 05:23 AM
Hi Brenda,
I have extensive experience optimizing Spark jobs on CDP and dealing with shuffle-heavy workloads.
Your shuffle-phase timeouts sound like a combination of executor memory settings and partition strategy problems, which is quite common after CDP migrations.
I can help diagnose the bottlenecks through Spark UI analysis and YARN logs, then tune your spark.sql.shuffle.partitions and executor configurations.
I have successfully reduced similar ETL job runtimes by 60-70% for other clients facing post-migration performance degradation.
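To give a feel for the partition tuning mentioned above, here is a minimal sizing heuristic as a sketch. The 128 MB target per shuffle task and the helper name are my own illustration, not a setting from this thread; the right target depends on your executor memory and data skew.

```python
# Heuristic sketch: size spark.sql.shuffle.partitions so each shuffle task
# handles roughly `target_partition_bytes` of data, never going below
# Spark's default of 200 partitions.
def suggested_shuffle_partitions(dataset_bytes, target_partition_bytes=128 * 1024**2):
    """Return a shuffle partition count aiming at ~128 MB per task."""
    return max(200, dataset_bytes // target_partition_bytes)

# For the ~500 GB dataset described in the original post:
parts = suggested_shuffle_partitions(500 * 1024**3)
print(parts)  # → 4000
```

The resulting number would then be applied via `--conf spark.sql.shuffle.partitions=4000` on spark-submit, or `spark.conf.set("spark.sql.shuffle.partitions", parts)` in the session; too few partitions produces oversized tasks that spill, too many adds scheduling overhead.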
You can reach out to me by email here.
Colin
Created 10-06-2025 01:18 PM
Hello @Brenda99,
The question is quite broad; many factors can affect performance.
Some basic recommendations are documented here:
https://docs.cloudera.com/cdp-private-cloud-base/7.3.1/tuning-spark/topics/spark-admin_spark_tuning....
Take a look at that documentation; it should help you.
It is also worth talking with the team in charge of your account about a deeper performance tuning analysis.
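As a companion to that tuning guide, the YARN executor sizing it discusses boils down to simple arithmetic: leave headroom on each node for the OS and NodeManager, divide the rest into executors, and reserve a slice of each executor's memory for YARN overhead. The node specs, the 5-cores-per-executor rule of thumb, and the 10% overhead fraction below are assumptions for illustration, not values from this thread.

```python
# Illustrative YARN executor sizing arithmetic (assumed node specs).
def size_executors(node_cores, node_mem_gb, cores_per_executor=5,
                   reserved_cores=1, reserved_mem_gb=8, overhead_fraction=0.10):
    """Split one worker node into executors.

    Reserves `reserved_cores`/`reserved_mem_gb` for the OS and NodeManager,
    then carves the remainder into executors of `cores_per_executor` cores,
    keeping ~10% of each executor's memory for spark.executor.memoryOverhead.
    Returns (executors_per_node, executor_heap_gb).
    """
    executors = (node_cores - reserved_cores) // cores_per_executor
    mem_per_executor = (node_mem_gb - reserved_mem_gb) / executors
    heap_gb = int(mem_per_executor * (1 - overhead_fraction))
    return executors, heap_gb

# Example: hypothetical 16-core, 128 GB worker nodes
print(size_executors(16, 128))  # → (3, 36)
```

These numbers would map to `--executor-cores 5 --executor-memory 36g` with roughly 4 GB of memory overhead per executor; undersized overhead is a frequent cause of YARN killing containers during heavy shuffles.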