Member since: 09-28-2017
Posts: 88
Kudos Received: 3
Solutions: 0
07-28-2022 06:00 AM
Hi, if this is not working from any of the hosts, I believe you need to try the URLs listed here for HDP 3.1.0: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/hdp_31_repositories.html Please bear in mind that the repositories are now behind a paywall, so you will have to provide credentials: https://www.cloudera.com/downloads/paywall-expansion.html
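For reference, here is a sketch of what an authenticated repo entry could look like on a yum-based host; the exact baseurl must be taken from the repository page above once you are logged in, and <username>/<password> are placeholders for your paywall credentials:

    # /etc/yum.repos.d/hdp.repo (illustrative path; verify against the docs above)
    [HDP-3.1.0]
    name=HDP-3.1.0
    baseurl=https://<username>:<password>@archive.cloudera.com/p/HDP/centos7/3.x/updates/3.1.0.0
    enabled=1
    gpgcheck=0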
02-25-2021 12:18 AM
Bumping again; I'm still trying to disable core dumps for all applications that run inside the containers.
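For context, one approach I'm experimenting with (my assumption, not confirmed in this thread: container processes inherit the NodeManager's resource limits) is to set a zero core-file size limit in yarn-env.sh before the NodeManager starts:

    # in yarn-env.sh (assumption: containers inherit this rlimit from the NodeManager)
    ulimit -c 0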
09-08-2019 04:28 PM
@ilia987, the message "Driver grid-05.test.com:36315 disassociated! Shutting down" suggests the AM had trouble getting back to the Driver. Can you share the below info:
- Did you run Spark in cluster or client mode?
- What is the full spark-submit command?
- What is the error on the client side where you ran spark-submit?
- What is the error on the YARN side? As suggested by @AKR, please share the entire application logs (see the command below).
Cheers, Eric
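The full YARN application logs can be collected with the standard log aggregation command; <application_id> is a placeholder for your own application ID:

    yarn logs -applicationId <application_id> > application_logs.txt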
04-02-2019 09:06 AM
You need to follow these steps instead, as those ones are for the Spark Thrift Server.

Configuring Cluster Dynamic Resource Allocation Manually

To configure a cluster to run Spark jobs with dynamic resource allocation, complete the following steps:

1. Add the following properties to the spark-defaults.conf file associated with your Spark installation (typically in the $SPARK_HOME/conf directory):
- Set spark.dynamicAllocation.enabled to true.
- Set spark.shuffle.service.enabled to true.
- (Optional) To specify a starting point and range for the number of executors, use spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors, and spark.dynamicAllocation.maxExecutors. Note that initialExecutors must be greater than or equal to minExecutors, and less than or equal to maxExecutors. For a description of each property, see Dynamic Resource Allocation Properties.

2. Start the shuffle service on each worker node in the cluster: in the yarn-site.xml file on each node, add spark_shuffle to yarn.nodemanager.aux-services, then set yarn.nodemanager.aux-services.spark_shuffle.class to org.apache.spark.network.yarn.YarnShuffleService.

3. Review and, if necessary, edit the spark.shuffle.service.* configuration settings. For more information, see the Apache Spark Shuffle Behavior documentation.

4. Restart all NodeManagers in your cluster.

A sketch of what both files could look like after these changes follows.
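This is a minimal sketch of the resulting configuration, not from the original post: the executor counts are illustrative assumptions, and any aux-services already present in your yarn-site.xml (such as mapreduce_shuffle) should be kept alongside spark_shuffle.

spark-defaults.conf:

    spark.dynamicAllocation.enabled true
    spark.shuffle.service.enabled true
    # illustrative range: initialExecutors >= minExecutors and <= maxExecutors
    spark.dynamicAllocation.initialExecutors 2
    spark.dynamicAllocation.minExecutors 1
    spark.dynamicAllocation.maxExecutors 10

yarn-site.xml (on each NodeManager host):

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <!-- keep any services that were already listed here -->
      <value>mapreduce_shuffle,spark_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
      <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>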
03-04-2019 10:41 AM
I don't think this helps me, because I want to increase the failure limit to more than 20.
02-20-2019 01:26 PM
I think that did what I needed. Thanks!
02-20-2019 11:21 AM
@Ilia K Spark can be used to interact with Hive. When you install Spark using Ambari, the hive-site.xml file is automatically populated with the Hive metastore location. https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_spark-component-guide/content/spark-config-hive.html
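As a quick illustration, here is a minimal PySpark sketch, assuming Spark 2 and a hypothetical Hive table default.my_table:

    # reads a Hive table through the metastore location configured in hive-site.xml
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-example")
             .enableHiveSupport()   # picks up hive-site.xml from Spark's conf directory
             .getOrCreate())

    spark.sql("SELECT * FROM default.my_table").show()  # default.my_table is hypothetical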
02-20-2019 12:11 PM
The server node has 32 GB RAM and only accepts spark-submit jobs (it does not act as a client/worker). Each worker node is one of two server types: 16 cores with 64 GB, or 48 cores with 196 GB. The worker nodes have only Metrics Monitor / NodeManager installed, and all the configuration is at defaults.

When running a large job I don't mind the minute of hold-up, but a short job should finish in under a minute. For example, 500 jobs (each taking 30 seconds on one core) should be over in under 1 minute when there is enough CPU/RAM to allocate. I think the problem is the delay before the job actually starts: by running top in a shell on the worker, I can see the process start 30-60 seconds after the submit is received, and I see some Java tasks, mainly related to the creation of the container.
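To quantify the delay, I can time a trivial job end-to-end; the example class and jar path below are from the stock Spark examples and may differ on your install:

    # times scheduling + container launch + a near-trivial compute (SparkPi, 10 partitions)
    time spark-submit --master yarn --deploy-mode cluster \
        --class org.apache.spark.examples.SparkPi \
        $SPARK_HOME/examples/jars/spark-examples_*.jar 10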
02-18-2019 11:27 AM
Works, thanks 🙂