About gsaha

gsaha · ‎11-10-2018

That's because this keytab is used by the YARN Service master, which needs a service principal and not a user principal. It's all aimed at thwarting replay attacks.

gsaha · ‎11-10-2018

Do you mean, will we support a principal of the format "user@EXAMPLE.COM"?

gsaha · ‎11-10-2018

You can upload the keytab from any one host to HDFS and then set the "keytab" value to that path, something like "hdfs:///user/user1/user1.keytab". Note that the principal_name in that case can no longer contain _HOST; it has to be expanded to the hostname of the host from which you uploaded the keytab, so something like "user1/host1.example.com@EXAMPLE.COM".
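To make the _HOST expansion concrete, here is a minimal sketch of how the two values could be derived. The helper names (expand_host, keytab_settings) are hypothetical, and lowercasing the hostname is an assumption; only the "principal_name" and "keytab" keys and the sample values come from the post above.

```python
def expand_host(principal_name, hostname):
    """Replace the _HOST placeholder with a concrete hostname.

    Lowercasing the hostname is an assumption made for this sketch.
    """
    if "_HOST" not in principal_name:
        return principal_name
    return principal_name.replace("_HOST", hostname.lower())


def keytab_settings(principal_name, hostname, hdfs_keytab_path):
    """Build the two values discussed above for a keytab stored in HDFS."""
    return {
        "principal_name": expand_host(principal_name, hostname),
        "keytab": hdfs_keytab_path,
    }


if __name__ == "__main__":
    print(
        keytab_settings(
            "user1/_HOST@EXAMPLE.COM",
            "host1.example.com",
            "hdfs:///user/user1/user1.keytab",
        )
    )
```

The hostname passed in should be the host whose keytab you uploaded, since that keytab only holds the key for that host's principal.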

gsaha · ‎04-10-2018

What kind of apps are you running? Can you check under the {application_id}/{container_id} directories under yarn.nodemanager.log-dirs if the apps are creating any logs? Are the apps creating a sub-directory under the container dir and then logging under that? Note, logs created under sub-directories are not aggregated.
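If it helps with the check above, here is a rough sketch that lists which files sit directly in a container directory and which sit in sub-directories. The function name and the demo file names are hypothetical; point it at a real {application_id}/{container_id} directory under one of your yarn.nodemanager.log-dirs paths.

```python
import tempfile
from pathlib import Path


def split_container_logs(container_dir):
    """Return (top_level, nested) lists of file paths relative to container_dir."""
    root = Path(container_dir)
    top_level, nested = [], []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # A file directly in the container directory has exactly one path component.
        (top_level if len(rel.parts) == 1 else nested).append(rel.as_posix())
    return top_level, nested


if __name__ == "__main__":
    # Demo on a throwaway layout; the directory and file names are made up.
    with tempfile.TemporaryDirectory() as tmp:
        container = Path(tmp) / "application_dir" / "container_dir"
        (container / "applogs").mkdir(parents=True)
        (container / "stderr").write_text("")
        (container / "applogs" / "app.log").write_text("")
        top, nested = split_container_logs(container)
        print("directly under the container dir:", top)
        print("under sub-directories:", nested)
```

Anything that shows up in the second list is a candidate for the "not aggregated" case described above.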

gsaha · ‎04-07-2018

Can you check the NM log on a host where at least one container of your job ran, to see if you find any errors related to log aggregation?

gsaha · ‎04-05-2018

Can you change ownership of /tmp in HDFS to yarn:hadoop instead of hdfs:hadoop? Is it a secure cluster?

gsaha · ‎03-01-2018

You need to use a new tag, not one of the existing ones. Typically it is the "version" keyword followed by the current timestamp. If you don't absolutely need to use the REST API, or you don't want to deal with the version tag, you should use configs.sh/configs.py (the sh variant is not supported in some older Ambari versions).

Sample get call:

    /var/lib/ambari-server/resources/scripts/configs.py -a get -l <ambari_server_host> -n <cluster_name> -c capacity-scheduler -f /tmp/cs.json

Sample output in /tmp/cs.json:

    {
      "properties": {
        "yarn.scheduler.capacity.maximum-am-resource-percent": "0.4",
        "yarn.scheduler.capacity.maximum-applications": "10000",
        "yarn.scheduler.capacity.node-locality-delay": "40",
        "yarn.scheduler.capacity.resource-calculator": "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator",
        "yarn.scheduler.capacity.queue-mappings-override.enable": "false",
        "yarn.scheduler.capacity.root.acl_administer_queue": "*",
        "yarn.scheduler.capacity.root.capacity": "100",
        "yarn.scheduler.capacity.root.queues": "Hive",
        "yarn.scheduler.capacity.root.accessible-node-labels": "*",
        "yarn.scheduler.capacity.root.Hive.acl_submit_applications": "*",
        "yarn.scheduler.capacity.root.Hive.maximum-capacity": "100",
        "yarn.scheduler.capacity.root.Hive.user-limit-factor": "4",
        "yarn.scheduler.capacity.root.Hive.state": "RUNNING",
        "yarn.scheduler.capacity.root.Hive.capacity": "100"
      }
    }

Help:

    /var/lib/ambari-server/resources/scripts/configs.py -h

To make the change you want, edit /tmp/cs.json (update the value of yarn.scheduler.capacity.root.Hive.user-limit-factor in your case), then use the "-a set" option with the same file. Sample command:

    /var/lib/ambari-server/resources/scripts/configs.py -a set -l <ambari_server_host> -n <cluster_name> -c capacity-scheduler -f /tmp/cs.json

Note, you need to refresh queues for this change to take effect. You can do it by running rmadmin from the command line:

    yarn rmadmin -refreshQueues

Or, use the Ambari REST API:

    curl -u admin:admin -H 'Content-Type:application/json' -H 'X-Requested-By:ambari' -iX PUT -d '{"save": "true"}' http://<ambari-server>:8080/api/v1/views/CAPACITY-SCHEDULER/versions/1.0.0/instances/<view_instance_name>/resources/scheduler/configuration/saveAndRefresh

You do not need to restart RM for capacity scheduler changes. However, if you make changes to other configs like yarn-site via configs.py, you need to restart RM. You can do so using the Ambari REST API as shown below.

Stop RM:

    curl -u admin:admin -H "X-Requested-By:ambari" -iX PUT -d '{"ServiceComponentInfo":{"state":"INSTALLED"}}' http://<ambari-server>:8080/api/v1/clusters/<cluster-name>/services/YARN/components/RESOURCEMANAGER

Start RM:

    curl -u admin:admin -H "X-Requested-By:ambari" -iX PUT -d '{"ServiceComponentInfo":{"state":"STARTED"}}' http://<ambari-server>:8080/api/v1/clusters/<cluster-name>/services/YARN/components/RESOURCEMANAGER
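For the two manual steps in this post (coming up with a fresh tag, and editing the JSON that "-a get" wrote), a small sketch along these lines may help. The function names are hypothetical, and using epoch milliseconds for the timestamp is an assumption; the post only says "version" followed by the current timestamp.

```python
import json
import time


def new_version_tag(now=None):
    """Return "version" followed by a timestamp (epoch milliseconds, an assumption)."""
    seconds = time.time() if now is None else now
    return "version" + str(int(seconds * 1000))


def set_property(config, key, value):
    """Return a copy of the configs.py-style dict with one property changed.

    Values are stored as strings, matching the sample output in the post.
    """
    properties = dict(config.get("properties", {}))
    properties[key] = str(value)
    return {**config, "properties": properties}


if __name__ == "__main__":
    # Trimmed-down stand-in for the contents of /tmp/cs.json.
    config = {
        "properties": {
            "yarn.scheduler.capacity.root.Hive.user-limit-factor": "4",
            "yarn.scheduler.capacity.root.Hive.capacity": "100",
        }
    }
    updated = set_property(
        config, "yarn.scheduler.capacity.root.Hive.user-limit-factor", 2
    )
    print(new_version_tag())
    print(json.dumps(updated, indent=2))
```

With configs.py you only need the edited JSON; the tag helper matters only if you go through the REST API directly.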

gsaha · ‎02-28-2018

Note, you are hitting an Ambari REST API and not the RM REST API.

gsaha · ‎02-28-2018

@Veerendra Nath Jasthi, is your Ambari Server running on localhost?

gsaha · ‎01-31-2018

In the RM UI, you can click on the app ID link for a Spark job, follow the app-attempt link, and then click on the logs link against the first container (typically the one ending with 0001). Check the AM logs there and see what you find.

Last Visited	‎12-13-2018 09:45 PM

Member Since	‎09-29-2015 06:00 AM
Posts	51
Kudos received	7

Cloudera Community

Re: Dockerized YARN services with Kerberos

Re: YARN logs for applicationId - for specific per...

Re: Overwrite yarn, mapred, tez and hive configura...

Re: Beeline query not getting launched - Queue's ...

Re: script to kill application after 20 minutes

Re: Dockerized YARN services with Kerberos

Re: Dockerized YARN services with Kerberos

Re: Dockerized YARN services with Kerberos

Re: Logs disappearing after Yarn Log Aggregation

Re: Logs disappearing after Yarn Log Aggregation

Re: Logs disappearing after Yarn Log Aggregation

Re: YARN ResourceManager REST API throwing erro?

Re: YARN ResourceManager REST API throwing erro?

Re: YARN ResourceManager REST API throwing erro?

Re: SPARK job taking more memory then it is given