Member since: 09-14-2018
Posts: 32
Kudos Received: 3
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1156 | 10-18-2019 01:09 AM |
11-21-2022 02:24 AM
1 Kudo
Hello @sfdragonstorm & @pacman, in the Region Name "img,0006943d-20150504220458043384375D00000002-00093,1527295748538.7b45a9f6f5584fc50b3152d41a5323a2.", the Table Name is "img", the StartKey is "0006943d-20150504220458043384375D00000002-00093", the Timestamp is "1527295748538", and "7b45a9f6f5584fc50b3152d41a5323a2" is the Region ID. Under the HBase data directory, each table directory contains region-level directories identified by this Region ID ("7b45a9f6f5584fc50b3152d41a5323a2" in the example). The Region ID is an MD5-encoded string derived from the Region Name and generated by HBase itself. Refer to [1] if your team wishes to review the details. Regards, Smarak [1] https://hbase.apache.org/apidocs/src-html/org/apache/hadoop/hbase/client/RegionInfo.html#line.164
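As a quick way to confirm this on a cluster, listing the table directory under the HBase root directory shows one subdirectory per region, named by the encoded Region ID. The sketch below assumes the default hbase.rootdir of /hbase and the default namespace; adjust the paths for your environment.
hdfs dfs -ls /hbase/data/default/img
# one directory per region, named by the encoded Region ID, e.g.
# /hbase/data/default/img/7b45a9f6f5584fc50b3152d41a5323a2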
02-01-2022 08:19 PM
Spark uses the blacklist mechanism to enhance the scheduler's ability to track failures. When a task fails on an executor, the blacklist module tracks the executor and the host on which it failed. Beyond a threshold, the scheduler can no longer schedule tasks on that node. If spark.blacklist.enabled is set to true, spark.blacklist.task.maxTaskAttemptsPerNode must always be kept lower than spark.task.maxFailures; otherwise the Spark job will fail with the following error message:
ERROR util.Utils: Uncaught exception in thread main
java.lang.IllegalArgumentException: spark.blacklist.task.maxTaskAttemptsPerNode ( = 2) was >= spark.task.maxFailures ( = 2 ). Though blacklisting is enabled, with this configuration, Spark will not be robust to one bad node. Decrease spark.blacklist.task.maxTaskAttemptsPerNode, increase spark.task.maxFailures, or disable blacklisting with spark.blacklist.enabled.
For example, triggering the following job from the shell of an edge node:
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --conf spark.blacklist.enabled=true --conf spark.task.maxFailures=2 --conf spark.blacklist.task.maxTaskAttemptsPerNode=2 /opt/cloudera/parcels/CDH-{Version}/jars/spark-examples_{version}.jar 10
results in the job failing with the following error:
ERROR util.Utils: Uncaught exception in thread main java.lang.IllegalArgumentException: spark.blacklist.task.maxTaskAttemptsPerNode ( = 2) was >= spark.task.maxFailures ( = 2 ). Though blacklisting is enabled, with this configuration, Spark will not be robust to one bad node. Decrease spark.blacklist.task.maxTaskAttemptsPerNode, increase spark.task.maxFailures, or disable blacklisting with spark.blacklist.enabled
As clearly mentioned in the log message, the above error happens if you set spark.blacklist.task.maxTaskAttemptsPerNode >= spark.task.maxFailures.
You can resolve the above error by setting spark.task.maxFailures > spark.blacklist.task.maxTaskAttemptsPerNode as follows:
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --conf spark.blacklist.enabled=true --conf spark.task.maxFailures=3 --conf spark.blacklist.task.maxTaskAttemptsPerNode=2 /opt/cloudera/parcels/CDH-{Version}/jars/spark-examples_{version}.jar 10
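Alternatively, as the error message itself suggests, you can disable blacklisting entirely when you do not need the feature. A minimal variant of the same job (all other settings unchanged) would be:
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --conf spark.blacklist.enabled=false /opt/cloudera/parcels/CDH-{Version}/jars/spark-examples_{version}.jar 10
With spark.blacklist.enabled=false, the consistency check above is not applied, since it is only performed while blacklisting is enabled.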
10-23-2019 08:12 AM
1 Kudo
Hi all, I figured it out and got it working. In addition to needing /etc/krb5.conf, I also needed hive-site.xml and core-site.xml from the secure CDH cluster. I copied both hive-site.xml and core-site.xml into my local apache-hive-X-bin/conf/ directory. Thanks,
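For anyone hitting the same issue, the fix amounts to something like the commands below. The gateway hostname is a placeholder, and /etc/hive/conf and /etc/hadoop/conf are the usual client-config locations on a CDH node; adjust them to wherever your cluster keeps its configs.
scp <cdh-gateway>:/etc/krb5.conf /etc/krb5.conf
scp <cdh-gateway>:/etc/hive/conf/hive-site.xml apache-hive-X-bin/conf/
scp <cdh-gateway>:/etc/hadoop/conf/core-site.xml apache-hive-X-bin/conf/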
10-18-2019 01:50 AM
Thanks @CLDR, your solution works! I tried the method: 1. tar the slave's /usr/bin, which must contain the soft links. 2. Copy the tar.gz to the master. It works fine! Thanks a lot!
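For reference, the two steps amount to something like the following; the hostname is a placeholder, and GNU tar stores symbolic links as links by default, so they survive the round trip.
# on the slave: archive /usr/bin, including its soft links
tar -czf usr-bin.tar.gz -C / usr/bin
# copy the archive to the master and unpack it in place
scp usr-bin.tar.gz <master-host>:/tmp/
ssh <master-host> tar -xzf /tmp/usr-bin.tar.gz -C /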