Member since: 03-17-2016
Posts: 132
Kudos Received: 106
Solutions: 13
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2498 | 03-28-2019 11:16 AM
 | 3118 | 03-28-2019 09:19 AM
 | 2569 | 02-02-2017 07:52 AM
 | 2696 | 10-03-2016 08:08 PM
 | 1143 | 09-13-2016 08:00 PM
12-19-2019
06:26 AM
Please check whether this file exists: /var/run/hadoop-yarn/yarn/hadoop-yarn-nodemanager.pid. If not, create the directory and the pid file with the correct ownership:

mkdir -p /var/run/hadoop-yarn/yarn/
chown -R yarn:hadoop /var/run/hadoop-yarn/yarn/
touch /var/run/hadoop-yarn/yarn/hadoop-yarn-nodemanager.pid
chown yarn:hadoop /var/run/hadoop-yarn/yarn/hadoop-yarn-nodemanager.pid

This will work.
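As a quick sanity check afterwards (not part of the original steps), confirm the ownership before restarting the NodeManager:

ls -ld /var/run/hadoop-yarn/yarn/
ls -l /var/run/hadoop-yarn/yarn/hadoop-yarn-nodemanager.pid
# both should show yarn:hadoop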
04-02-2019
09:06 AM
You need to follow these steps, as those settings are for the Spark Thrift server.

Configuring Cluster Dynamic Resource Allocation Manually

To configure a cluster to run Spark jobs with dynamic resource allocation, complete the following steps:

1. Add the following properties to the spark-defaults.conf file associated with your Spark installation (typically in the $SPARK_HOME/conf directory):
   - Set spark.dynamicAllocation.enabled to true.
   - Set spark.shuffle.service.enabled to true.
   - (Optional) To specify a starting point and range for the number of executors, use spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors, and spark.dynamicAllocation.maxExecutors. Note that initialExecutors must be greater than or equal to minExecutors, and less than or equal to maxExecutors. For a description of each property, see Dynamic Resource Allocation Properties.
2. Start the shuffle service on each worker node in the cluster: in the yarn-site.xml file on each node, add spark_shuffle to yarn.nodemanager.aux-services, and then set yarn.nodemanager.aux-services.spark_shuffle.class to org.apache.spark.network.yarn.YarnShuffleService. Review and, if necessary, edit the spark.shuffle.service.* configuration settings. For more information, see the Apache Spark Shuffle Behavior documentation.
3. Restart all NodeManagers in your cluster.
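As a concrete sketch of steps 1 and 2 (the executor counts are illustrative values, not recommendations), spark-defaults.conf would contain:

# dynamic allocation with the external shuffle service
spark.dynamicAllocation.enabled true
spark.shuffle.service.enabled true
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.initialExecutors 2
spark.dynamicAllocation.maxExecutors 10

and yarn-site.xml on each NodeManager would carry:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <!-- keep any existing services such as mapreduce_shuffle in the list -->
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>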
03-29-2019
06:37 AM
Have you followed these steps before adding node labels to the YARN cluster? https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_yarn-resource-management/content/configuring_node_labels.html
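For reference, once yarn.node-labels.enabled and the label store directory are set as described in that guide, the basic commands look like this (the label name "gpu" and the hostname are hypothetical):

yarn rmadmin -addToClusterNodeLabels "gpu"
yarn rmadmin -replaceLabelsOnNode "host1.example.com=gpu"
yarn cluster --list-node-labels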
03-29-2019
06:18 AM
Follow this https://github.com/ehiggs/spark-terasort
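Roughly, per that repo's README (the class names come from the repo; the jar name depends on how you build it, so treat it as a placeholder), generating and sorting data looks like:

spark-submit --class com.github.ehiggs.spark.terasort.TeraGen \
  spark-terasort-*-jar-with-dependencies.jar 10g hdfs:///tmp/terasort_in
spark-submit --class com.github.ehiggs.spark.terasort.TeraSort \
  spark-terasort-*-jar-with-dependencies.jar hdfs:///tmp/terasort_in hdfs:///tmp/terasort_out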
03-29-2019
04:43 AM
For this you need to use Spark dynamic allocation. Dynamic Allocation (of Executors), aka Elastic Scaling, is a Spark feature that adds or removes Spark executors dynamically to match the workload. Unlike the "traditional" static allocation, where a Spark application reserves CPU and memory resources upfront (irrespective of how much it may eventually use), with dynamic allocation you get as much as needed and no more. It scales the number of executors up and down based on workload: idle executors are removed, and when there are pending tasks waiting for executors, dynamic allocation requests more.
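A minimal per-job sketch, assuming the external shuffle service is already enabled on the NodeManagers (the application class, jar name, and executor counts are placeholders, not from the original question):

spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.maxExecutors=20 \
  --class com.example.MyApp my-app.jar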
03-29-2019
04:35 AM
In /etc/yum.repos.d, remove all .repo files pointing to the Internet and copy only the .repo files from other servers that are already using your local repo. For HDP nodes, initially you need only two .repo files: one for the OS, and ambari.repo. When Ambari adds a new node to the cluster, it will copy HDP.repo and HDP-UTILS.repo there. Also, have you set your repository URLs in Ambari -> Admin -> Stack and Versions -> Versions -> Manage Versions -> [click on your current version]?
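A minimal sketch of that cleanup (the .repo file names and the source hostname are examples for your environment; moving rather than deleting lets you revert):

cd /etc/yum.repos.d
mkdir -p backup
mv CentOS-Base.repo epel.repo backup/        # whatever internet-facing .repo files exist on this host
scp working-node:/etc/yum.repos.d/ambari.repo .
yum clean all
yum repolist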
03-28-2019
11:16 AM
1 Kudo
@Ruslan Fialkovsky You need to write custom code that blocks the -skipTrash option. This is the file where you need to place it:

vi /usr/hdp/current/hadoop-client/bin/hadoop

#!/bin/bash
export HADOOP_HOME=${HADOOP_HOME:-/usr/hdp/2.6.5.0-292/hadoop}
export HADOOP_MAPRED_HOME=${HADOOP_MAPRED_HOME:-/usr/hdp/2.6.5.0-292/hadoop-mapreduce}
export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-/usr/hdp/2.6.5.0-292/hadoop-yarn}
export HADOOP_LIBEXEC_DIR=${HADOOP_HOME}/libexec
export HDP_VERSION=${HDP_VERSION:-2.6.5.0-292}
export HADOOP_OPTS="${HADOOP_OPTS} -Dhdp.version=${HDP_VERSION}"
exec /usr/hdp/2.6.5.0-292/hadoop/bin/hadoop.distro "$@"
### here you need to write code to restrict skipTrash
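A minimal sketch of that restriction, placed just before the exec line (the error message and exit code are my own choices, not part of the original wrapper):

# block any invocation that passes -skipTrash before handing off to the real binary
for arg in "$@"; do
  if [ "$arg" = "-skipTrash" ]; then
    echo "ERROR: -skipTrash is disabled on this cluster; remove it so deleted files go to the trash" >&2
    exit 1
  fi
done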
03-28-2019
10:54 AM
You need to have a look at the user limit factor for the queue and its minimum/maximum capacity:
https://community.hortonworks.com/content/supportkb/49640/what-does-the-user-limit-factor-do-when-used-in-ya.html
https://hortonworks.com/blog/yarn-capacity-scheduler/
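For illustration, the relevant Capacity Scheduler properties look like this (the queue name "analytics" and the values are hypothetical; tune them for your cluster):

yarn.scheduler.capacity.root.analytics.capacity=30
yarn.scheduler.capacity.root.analytics.maximum-capacity=60
yarn.scheduler.capacity.root.analytics.user-limit-factor=2
yarn.scheduler.capacity.root.analytics.minimum-user-limit-percent=25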
03-28-2019
09:19 AM
There's an API to remove older versions from the hosts. Take a look at https://issues.apache.org/jira/browse/AMBARI-18435

For example:

curl 'http://c6401.ambari.apache.org:8080/api/v1/clusters/cl1/requests' -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d '{"RequestInfo":{"context":"remove_previous_stacks", "action" : "remove_previous_stacks", "parameters" : {"version":"2.5.0.0-1245"}}, "Requests/resource_filters": [{"hosts":"c6403.ambari.apache.org, c6402.ambari.apache.org"}]}'
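To watch the resulting operation, you can poll the request that the POST returns (the request id 123 here is a placeholder; use the id from the response body):

curl -u admin:admin -H "X-Requested-By: ambari" \
  'http://c6401.ambari.apache.org:8080/api/v1/clusters/cl1/requests/123'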