Spark uses the blacklist mechanism to enhance the scheduler's ability to track failures. When a task fails on an executor, the blacklist module tracks the executor and the host on which the task failed. Once a threshold is exceeded, the scheduler will no longer schedule tasks on that node. If spark.blacklist.enabled is set to true, spark.task.maxFailures must always be greater than spark.blacklist.task.maxTaskAttemptsPerNode; otherwise, the Spark job will fail with the following error message:

ERROR util.Utils: Uncaught exception in thread main
java.lang.IllegalArgumentException: spark.blacklist.task.maxTaskAttemptsPerNode ( = 2) was >= spark.task.maxFailures ( = 2 ). Though blacklisting is enabled, with this configuration, Spark will not be robust to one bad node. Decrease spark.blacklist.task.maxTaskAttemptsPerNode, increase spark.task.maxFailures, or disable blacklisting with spark.blacklist.enabled.
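The constraint behind this error can be modeled with a short sketch. This is simplified plain Python, not Spark's actual implementation, and the function name is hypothetical; it only illustrates the inequality the scheduler enforces: a single node must be allowed fewer task attempts than the total failure budget, or one bad node could exhaust spark.task.maxFailures before it is ever blacklisted.

```python
def validate_blacklist_conf(max_task_attempts_per_node: int, max_failures: int) -> None:
    """Simplified model of the check Spark performs when blacklisting is enabled.

    If a node may see as many attempts as the total failure budget, a single
    bad node can fail the whole job before blacklisting kicks in, so the
    configuration is rejected up front.
    """
    if max_task_attempts_per_node >= max_failures:
        raise ValueError(
            f"spark.blacklist.task.maxTaskAttemptsPerNode ( = {max_task_attempts_per_node}) "
            f"was >= spark.task.maxFailures ( = {max_failures}). "
            "Decrease maxTaskAttemptsPerNode, increase maxFailures, "
            "or disable blacklisting."
        )

# 2 attempts per node against a budget of 2 total failures is rejected ...
try:
    validate_blacklist_conf(2, 2)
except ValueError as e:
    print("rejected:", e)

# ... while 2 attempts per node against a budget of 3 failures is accepted.
validate_blacklist_conf(2, 3)
print("accepted")
```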

For example, if you submit the following job from the shell of an edge node:

spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode client \
  --conf spark.blacklist.enabled=true \
  --conf spark.task.maxFailures=2 \
  --conf spark.blacklist.task.maxTaskAttemptsPerNode=2 \
  /opt/cloudera/parcels/CDH-{Version}/jars/spark-examples_{version}.jar 10

it fails with the following error:

ERROR util.Utils: Uncaught exception in thread main java.lang.IllegalArgumentException: spark.blacklist.task.maxTaskAttemptsPerNode ( = 2) was >= spark.task.maxFailures ( = 2 ). Though blacklisting is enabled, with this configuration, Spark will not be robust to one bad node. Decrease spark.blacklist.task.maxTaskAttemptsPerNode, increase spark.task.maxFailures, or disable blacklisting with spark.blacklist.enabled

As the log message clearly states, this error occurs whenever spark.blacklist.task.maxTaskAttemptsPerNode >= spark.task.maxFailures.

 

You can resolve the error by setting spark.task.maxFailures greater than spark.blacklist.task.maxTaskAttemptsPerNode, as follows:

spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode client \
  --conf spark.blacklist.enabled=true \
  --conf spark.task.maxFailures=3 \
  --conf spark.blacklist.task.maxTaskAttemptsPerNode=2 \
  /opt/cloudera/parcels/CDH-{Version}/jars/spark-examples_{version}.jar 10
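If you want these settings applied to every job rather than passed per submission, the same properties can be set cluster-wide. A sketch of the corresponding spark-defaults.conf entries (assuming a standard Spark configuration layout):

```
spark.blacklist.enabled                       true
spark.task.maxFailures                        3
spark.blacklist.task.maxTaskAttemptsPerNode   2
```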