Member since: 04-07-2017
Posts: 10
Kudos Received: 0
Solutions: 0
01-30-2018 04:45 AM
Sorry this is nearly a year later, but the behavior you're seeing is likely because spark.executor.instances and --num-executors no longer disable dynamic allocation in Spark 2.x. You now have to set spark.dynamicAllocation.enabled to false explicitly; otherwise Spark only uses the value of those properties as the initial executor count and still scales the count up and down with dynamic allocation. That may or may not explain your situation, since you mentioned experimenting with some of those properties.

Additionally, the remaining ~13,000 tasks you describe don't necessarily mean there are 13,000 pending tasks; a large portion of those could belong to future stages that depend on the current one. When you see the number of executors drop, it's likely that the current stage wasn't using all the available executors, so they hit the idle timeout and were released.

If you want a static number of executors, explicitly disable dynamic allocation. It's also worth checking whether the task count is low at the time of the "decay" and looking at the data to figure out why; that can often be resolved simply by repartitioning the RDD/Dataset/DataFrame in the stage with the low partition count. Alternatively, there could be a resource management configuration issue or an actual bug behind the behavior, but I would start with the assumption that it's related to the config, data, or partitioning.
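For reference, a minimal sketch of what I mean (the app name, executor count, input path, and partition count below are all hypothetical, and these properties are just as commonly passed via spark-submit with --conf and --num-executors):

    import org.apache.spark.sql.SparkSession

    // Pin a static executor count in Spark 2.x: dynamic allocation must be
    // disabled explicitly, otherwise spark.executor.instances only seeds the
    // initial count and the executor count still scales up and down.
    val spark = SparkSession.builder()
      .appName("static-executors-example")                 // hypothetical app name
      .config("spark.dynamicAllocation.enabled", "false")  // turn off scaling
      .config("spark.executor.instances", "50")            // hypothetical static count
      .getOrCreate()

    // If a stage runs with too few partitions (leaving executors idle),
    // repartitioning the DataFrame before the expensive stage can help.
    val df = spark.read.parquet("/path/to/input")          // hypothetical input path
    val repartitioned = df.repartition(400)                // hypothetical partition count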
01-08-2018 11:36 AM
Thanks Ben. Also, are there any updates announced by Cloudera about CDH upgrades required for Meltdown or Spectre, apart from the OS patches? Thanks!
07-12-2017 08:24 AM
My issue here ended up being caused by the Amazon Elastic Load Balancer timing out the connection after 60 seconds.
04-25-2017 12:15 PM
Thanks @aarman; that worked wonderfully!