Member since: 03-17-2016
Posts: 132
Kudos Received: 106
Solutions: 13
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2498 | 03-28-2019 11:16 AM
 | 3118 | 03-28-2019 09:19 AM
 | 2569 | 02-02-2017 07:52 AM
 | 2696 | 10-03-2016 08:08 PM
 | 1143 | 09-13-2016 08:00 PM
12-19-2019
06:26 AM
Please check whether this file exists: /var/run/hadoop-yarn/yarn/hadoop-yarn-nodemanager.pid. If not, create the directory and the pid file with the correct ownership:

mkdir -p /var/run/hadoop-yarn/yarn/
chown -R yarn:hadoop /var/run/hadoop-yarn/yarn/
touch /var/run/hadoop-yarn/yarn/hadoop-yarn-nodemanager.pid
chown yarn:hadoop /var/run/hadoop-yarn/yarn/hadoop-yarn-nodemanager.pid

This will work.
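As a quick sanity check afterwards (not part of the original steps), confirm the ownership before restarting the NodeManager:

ls -ld /var/run/hadoop-yarn/yarn/
ls -l /var/run/hadoop-yarn/yarn/hadoop-yarn-nodemanager.pid
# both should show yarn:hadoop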
04-02-2019
09:06 AM
You need to follow these steps, as those settings are for the Spark Thrift server.

Configuring Cluster Dynamic Resource Allocation Manually

To configure a cluster to run Spark jobs with dynamic resource allocation, complete the following steps:

1. Add the following properties to the spark-defaults.conf file associated with your Spark installation (typically in the $SPARK_HOME/conf directory):
   - Set spark.dynamicAllocation.enabled to true.
   - Set spark.shuffle.service.enabled to true.
   - (Optional) To specify a starting point and range for the number of executors, use spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors, and spark.dynamicAllocation.maxExecutors. Note that initialExecutors must be greater than or equal to minExecutors, and less than or equal to maxExecutors. For a description of each property, see Dynamic Resource Allocation Properties.
2. Start the shuffle service on each worker node in the cluster: in the yarn-site.xml file on each node, add spark_shuffle to yarn.nodemanager.aux-services, and then set yarn.nodemanager.aux-services.spark_shuffle.class to org.apache.spark.network.yarn.YarnShuffleService. Review and, if necessary, edit the spark.shuffle.service.* configuration settings. For more information, see the Apache Spark Shuffle Behavior documentation.
3. Restart all NodeManagers in your cluster.
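As a concrete sketch of steps 1 and 2 (the executor counts are illustrative values, not recommendations), spark-defaults.conf would contain:

# dynamic allocation with the external shuffle service
spark.dynamicAllocation.enabled true
spark.shuffle.service.enabled true
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.initialExecutors 2
spark.dynamicAllocation.maxExecutors 10

and yarn-site.xml on each NodeManager would carry:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <!-- keep any existing services such as mapreduce_shuffle in the list -->
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>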
03-29-2019
06:37 AM
Have you followed these steps before adding node labels to the YARN cluster? https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_yarn-resource-management/content/configuring_node_labels.html
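For reference, once yarn.node-labels.enabled and the label store directory are set as described in that guide, the basic commands look like this (the label name "gpu" and the hostname are hypothetical):

yarn rmadmin -addToClusterNodeLabels "gpu"
yarn rmadmin -replaceLabelsOnNode "host1.example.com=gpu"
yarn cluster --list-node-labels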
03-29-2019
06:18 AM
Follow this https://github.com/ehiggs/spark-terasort
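Roughly, per that repo's README (the class names come from the repo; the jar name depends on how you build it, so treat it as a placeholder), generating and sorting data looks like:

spark-submit --class com.github.ehiggs.spark.terasort.TeraGen \
  spark-terasort-*-jar-with-dependencies.jar 10g hdfs:///tmp/terasort_in
spark-submit --class com.github.ehiggs.spark.terasort.TeraSort \
  spark-terasort-*-jar-with-dependencies.jar hdfs:///tmp/terasort_in hdfs:///tmp/terasort_out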
03-29-2019
04:43 AM
For this you need to use Spark dynamic allocation. Dynamic Allocation (of Executors), aka Elastic Scaling, is a Spark feature that adds or removes Spark executors dynamically to match the workload. Unlike the "traditional" static allocation, where a Spark application reserves CPU and memory resources upfront (irrespective of how much it may eventually use), with dynamic allocation you get as much as needed and no more. It scales the number of executors up and down based on workload: idle executors are removed, and when there are pending tasks waiting for executors, dynamic allocation requests more.
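A minimal per-job sketch, assuming the external shuffle service is already enabled on the NodeManagers (the application class, jar name, and executor counts are placeholders, not from the original question):

spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.maxExecutors=20 \
  --class com.example.MyApp my-app.jar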
03-29-2019
04:35 AM
In /etc/yum.repos.d, remove all .repo files pointing to the Internet and copy only the .repo files from other servers that are already using your local repo. For HDP nodes, initially you need only two .repo files: one for the OS, and ambari.repo. When Ambari adds a new node to the cluster, it will copy HDP.repo and HDP-UTILS.repo there. Also, have you set your repository URLs in Ambari -> Admin -> Stack and Versions -> Versions -> Manage Versions -> [click on your current version]?
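A minimal sketch of that cleanup (the .repo file names and the source hostname are examples for your environment; moving rather than deleting lets you revert):

cd /etc/yum.repos.d
mkdir -p backup
mv CentOS-Base.repo epel.repo backup/        # whatever internet-facing .repo files exist on this host
scp working-node:/etc/yum.repos.d/ambari.repo .
yum clean all
yum repolist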
03-28-2019
11:16 AM
1 Kudo
@Ruslan Fialkovsky You need to write custom code that blocks the -skipTrash option. This is the file where you need to place it:

vi /usr/hdp/current/hadoop-client/bin/hadoop

#!/bin/bash
export HADOOP_HOME=${HADOOP_HOME:-/usr/hdp/2.6.5.0-292/hadoop}
export HADOOP_MAPRED_HOME=${HADOOP_MAPRED_HOME:-/usr/hdp/2.6.5.0-292/hadoop-mapreduce}
export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-/usr/hdp/2.6.5.0-292/hadoop-yarn}
export HADOOP_LIBEXEC_DIR=${HADOOP_HOME}/libexec
export HDP_VERSION=${HDP_VERSION:-2.6.5.0-292}
export HADOOP_OPTS="${HADOOP_OPTS} -Dhdp.version=${HDP_VERSION}"
exec /usr/hdp/2.6.5.0-292/hadoop/bin/hadoop.distro "$@"
### here you need to write code to restrict skipTrash
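A minimal sketch of that restriction, placed just before the exec line (the error message and exit code are my own choices, not part of the original wrapper):

# block any invocation that passes -skipTrash before handing off to the real binary
for arg in "$@"; do
  if [ "$arg" = "-skipTrash" ]; then
    echo "ERROR: -skipTrash is disabled on this cluster; remove it so deleted files go to the trash" >&2
    exit 1
  fi
done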
03-28-2019
10:54 AM
You need to have a look at the user limit factor for the queue and its minimum/maximum capacity:
https://community.hortonworks.com/content/supportkb/49640/what-does-the-user-limit-factor-do-when-used-in-ya.html
https://hortonworks.com/blog/yarn-capacity-scheduler/
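For illustration, the relevant Capacity Scheduler properties look like this (the queue name "analytics" and the values are hypothetical; tune them for your cluster):

yarn.scheduler.capacity.root.analytics.capacity=30
yarn.scheduler.capacity.root.analytics.maximum-capacity=60
yarn.scheduler.capacity.root.analytics.user-limit-factor=2
yarn.scheduler.capacity.root.analytics.minimum-user-limit-percent=25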
03-28-2019
09:19 AM
There's an API to remove older versions from the hosts. Take a look at https://issues.apache.org/jira/browse/AMBARI-18435

For example:

curl 'http://c6401.ambari.apache.org:8080/api/v1/clusters/cl1/requests' -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d '{"RequestInfo":{"context":"remove_previous_stacks", "action" : "remove_previous_stacks", "parameters" : {"version":"2.5.0.0-1245"}}, "Requests/resource_filters": [{"hosts":"c6403.ambari.apache.org, c6402.ambari.apache.org"}]}'
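To watch the resulting operation, you can poll the request that the POST returns (the request id 123 here is a placeholder; use the id from the response body):

curl -u admin:admin -H "X-Requested-By: ambari" \
  'http://c6401.ambari.apache.org:8080/api/v1/clusters/cl1/requests/123'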