09-29-2025
05:23 AM
Hi Brenda, I have extensive experience optimizing Spark jobs on CDP and dealing with shuffle-heavy workloads. Your timeout issues during the shuffle phase sound like a combination of executor memory settings and partitioning strategy problems, which are pretty common after CDP migrations. I can help diagnose the bottlenecks through Spark UI analysis and YARN logs, then tune your spark.sql.shuffle.partitions and executor configuration. I have reduced similar ETL job runtimes by 60-70% for other clients facing post-migration performance degradation. You can reach out to me on my email here.
Colin
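As a rough illustration of the kind of tuning mentioned above, here is a minimal sketch that sizes spark.sql.shuffle.partitions from an estimated shuffle volume. The helper name `suggest_shuffle_conf`, the 128 MiB per-partition target, the 500 GiB shuffle size, and the memory and timeout values are all assumptions for the example, not measurements from your job; the real numbers should come from the Spark UI stage metrics and YARN logs.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3


def suggest_shuffle_conf(shuffle_bytes, target_partition_bytes=128 * MIB,
                         min_partitions=200):
    """Hypothetical helper: derive a starting-point Spark config.

    shuffle_bytes is the shuffle write size read off the Spark UI.
    The 128 MiB target per partition is an assumed rule of thumb.
    """
    # Enough partitions that each one holds roughly the target size,
    # but never fewer than Spark's default of 200.
    partitions = max(min_partitions,
                     math.ceil(shuffle_bytes / target_partition_bytes))
    return {
        "spark.sql.shuffle.partitions": str(partitions),
        # Placeholder values; size these to the actual cluster.
        "spark.network.timeout": "600s",
        "spark.executor.memory": "8g",
        "spark.executor.memoryOverhead": "2g",
    }


# Example: an assumed 500 GiB shuffle stage.
conf = suggest_shuffle_conf(500 * GIB)
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The printed lines can be appended to a spark-submit command. Treat them as a first guess to iterate on, since the right partition count also depends on data skew and executor core count.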