Spark 2 - attemptFailuresValidityInterval issue
Labels:
- Apache Spark
- Apache YARN
Created on 04-03-2018 07:30 AM - edited 09-16-2022 06:03 AM
Hi!
We are running spark-submit with the following options:
--deploy-mode cluster
--conf "spark.yarn.maxAppAttempts=3"
--conf "spark.yarn.am.attemptFailuresValidityInterval=30s"
--conf...
and our application intentionally throws an exception on the driver after 70 seconds, in order to force a failure.
With these parameters we expected the application to run forever: since spark.yarn.am.attemptFailuresValidityInterval (30s) is shorter than the 70 seconds it takes for the exception to be thrown, each failure should fall outside the validity window and the attempt counter should be reset before the next failure. Instead, the application stops after 3 failures.
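For illustration, the full invocation looks roughly like this (the main class and jar path below are placeholders, not our exact command):

# Sketch of the spark-submit command; class name and jar path are placeholders
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf "spark.yarn.maxAppAttempts=3" \
  --conf "spark.yarn.am.attemptFailuresValidityInterval=30s" \
  --class com.example.FailingDriver \
  /path/to/failing-app.jar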
Our installation:
- SPARK2-2.1.0.cloudera2
- CDH 5.11
Any ideas are more than welcome!
Created 04-10-2018 11:08 PM
Sorry, this is a bug, described in SPARK-22876, which indicates that the current logic of spark.yarn.am.attemptFailuresValidityInterval is flawed.
The JIRA is still being worked on, but judging from the comments I don't foresee a fix anytime soon.
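If you want to see what YARN actually recorded, you can list the attempts for the application directly. This is just a sketch: the application ID below is a placeholder, and it assumes the yarn CLI in your Hadoop version supports the applicationattempt subcommand.

# List the attempts the ResourceManager recorded for the application (ID is a placeholder)
yarn applicationattempt -list application_1522771234567_0042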
