Member since
03-07-2016
3
Posts
2
Kudos Received
0
Solutions
02-24-2020
06:14 PM
I tried with spark.streaming.backpressure.pid.minRate which work as expected. My configuration: spark.shuffle.service.enabled: "true"
spark.streaming.kafka.maxRatePerPartition: "600"
spark.streaming.backpressure.enabled: "true"
spark.streaming.concurrentJobs: "1"
spark.executor.extraJavaOptions: "-XX:+UseConcMarkSweepGC"
batch.duration: 5
spark.streaming.backpressure.pid.minRate: 2000 so, by configuration it starts with = 15 (total Number of partitions) X 600 (maxRatePerPartition) X 5 (batch Duration) = 45000 but it doesn't able to process these many records in 5 seconds. It drops to ~10,000 = 2000 (pid.minRate) X 5 (batch duration) So, it spark.streaming.backpressure.pid.minRate is total records per seconds. Just set spark.streaming.backpressure.pid.minRate and leave following config as default spark.streaming.backpressure.pid.integral
spark.streaming.backpressure.pid.proportional
spark.streaming.backpressure.pid.derived
... View more