Yarn service check failed.....

Contributor

I am in the process of upgrading a cluster from HDP 2.4 to 2.6. As part of this, I ran the service check on YARN, and it fails with the error below.

Can you please help me out?

stderr: /var/lib/ambari-agent/data/errors-5615.txt

Traceback (most recent call last):

File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 159, in <module>

ServiceCheck().execute()

File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute

method(env)

File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 117, in service_check

user=params.smokeuser,

File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 71, in inner

result = function(command, **kwargs)

File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 93, in checked_call

tries=tries, try_sleep=try_sleep)

File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 141, in _call_wrapper

result = _call(command, **kwargs_copy)

File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 294, in _call

raise Fail(err_msg)

resource_management.core.exceptions.Fail: Execution of 'yarn org.apache.hadoop.yarn.applications.distributedshell.Client -shell_command ls -num_containers 1 -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -timeout 300000 --queue default' returned 1. 18/06/22 11:46:15 INFO impl.TimelineClientImpl: Timeline service address: http://IPaddress:8188/ws/v1/timeline/

18/06/22 11:46:16 INFO distributedshell.Client: Initializing Client

18/06/22 11:46:16 INFO distributedshell.Client: Running Client

18/06/22 11:46:16 INFO client.RMProxy: Connecting to ResourceManager at IPaddress:8050

18/06/22 11:46:18 INFO distributedshell.Client: Got Cluster metric info from ASM, numNodeManagers=3

18/06/22 11:46:18 INFO distributedshell.Client: Got Cluster node info from ASM

18/06/22 11:46:18 INFO distributedshell.Client: Got node report from ASM for, nodeId=IPaddress:45454, nodeAddressIPaddress:8042, nodeRackName/default-rack, nodeNumContainers0

18/06/22 11:46:18 INFO distributedshell.Client: Got node report from ASM for, nodeId=IPaddress:45454, nodeAddressIPaddress:8042, nodeRackName/default-rack, nodeNumContainers1

18/06/22 11:46:18 INFO distributedshell.Client: Got node report from ASM for, nodeId=ipIPaddress:45454, nodeAddressipIPaddress:8042, nodeRackName/default-rack, nodeNumContainers0

18/06/22 11:46:19 INFO distributedshell.Client: Queue info, queueName=default, queueCurrentCapacity=0.04761905, queueMaxCapacity=1.0, queueApplicationCount=10000, queueChildQueueCount=0

18/06/22 11:46:19 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS

18/06/22 11:46:19 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=ADMINISTER_QUEUE

18/06/22 11:46:19 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS

18/06/22 11:46:19 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE

18/06/22 11:46:19 INFO distributedshell.Client: Max mem capabililty of resources in this cluster 14336

18/06/22 11:46:19 INFO distributedshell.Client: Max virtual cores capabililty of resources in this cluster 3

18/06/22 11:46:19 INFO distributedshell.Client: Copy App Master jar from local filesystem and add to local environment

18/06/22 11:46:19 INFO distributedshell.Client: Set the environment for the application master

18/06/22 11:46:19 INFO distributedshell.Client: Setting up app master command

18/06/22 11:46:19 INFO distributedshell.Client: Completed setting up app master command {{JAVA_HOME}}/bin/java -Xmx10m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 10 --container_vcores 1 --num_containers 1 --priority 0 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr

18/06/22 11:46:19 INFO distributedshell.Client: Submitting application to ASM

18/06/22 11:46:19 FATAL distributedshell.Client: Error running Client

org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1519070798024_136600 to YARN : org.apache.hadoop.security.AccessControlException: Queue root.default already has 10000 applications, cannot accept submission of application: application_1519070798024_136600

at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:271)

at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:708)

at org.apache.hadoop.yarn.applications.distributedshell.Client.main(Client.java:215)

stdout: /var/lib/ambari-agent/data/output-5615.txt

2018-06-22 11:46:10,716 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf

2018-06-22 11:46:10,718 - call['ambari-python-wrap /usr/bin/hdp-select status hadoop-yarn-resourcemanager'] {'timeout': 20}

2018-06-22 11:46:10,772 - call returned (0, 'hadoop-yarn-resourcemanager - 2.4.3.0-227')

2018-06-22 11:46:10,783 - Stack Feature Version Info: stack_version=2.4, version=None, current_cluster_version=2.4.3.0-227 -> 2.4

2018-06-22 11:46:10,846 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf

2018-06-22 11:46:10,857 - HdfsResource['/user/ambari-qa'] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': [EMPTY], 'dfs_type': '', 'default_fs': 'hdfs://IPaddress:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': [EMPTY], 'user': 'hdfs', 'owner': 'ambari-qa', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'directory', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0770}

2018-06-22 11:46:10,867 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET '"'"'http://IPaddress:50070/webhdfs/v1/user/ambari-qa?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmp96jtLg 2>/tmp/tmp9nMa1h''] {'logoutput': None, 'quiet': False}

2018-06-22 11:46:10,940 - call returned (0, '')

2018-06-22 11:46:10,945 - checked_call['yarn org.apache.hadoop.yarn.applications.distributedshell.Client -shell_command ls -num_containers 1 -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -timeout 300000 --queue default'] {'path': '/usr/sbin:/sbin:/usr/local/bin:/bin:/usr/bin', 'user': 'ambari-qa'}

Command failed after 1 tries


Re: Yarn service check failed.....

Expert Contributor

The default queue has reached its application limit and is rejecting new submissions. Try again once the queue can accept applications:

org.apache.hadoop.security.AccessControlException: Queue root.default already has 10000 applications, cannot accept submission of application: application_1519070798024_136600
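For context, here is a rough sketch of where the 10000 in that message likely comes from. It assumes the CapacityScheduler is in use and that the limit follows the documented rule: the system-wide `yarn.scheduler.capacity.maximum-applications` (default 10000) scaled by the queue's absolute capacity, unless a per-queue `maximum-applications` override is set. The function name and the capacity value of 1.0 for `default` are assumptions for illustration, not values read from this cluster's configuration.

```python
def queue_max_applications(system_max_apps, absolute_capacity, queue_override=None):
    """Approximate a CapacityScheduler queue's application limit.

    system_max_apps   -- yarn.scheduler.capacity.maximum-applications (default 10000)
    absolute_capacity -- the queue's share of the cluster as a fraction (0.0 to 1.0)
    queue_override    -- per-queue maximum-applications, if one is configured
    """
    if queue_override is not None:
        # A per-queue setting takes precedence over the scaled system-wide value.
        return queue_override
    return int(system_max_apps * absolute_capacity)


# Assumed: a single "default" queue holding 100% of the cluster.
print(queue_max_applications(10000, 1.0))
```

If that assumption holds, the limit for `root.default` is the full system-wide default, which matches the number in the exception.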

Re: Yarn service check failed.....

Contributor

I have tried multiple times, but I keep getting the same response.

Can you please suggest how to see the resource details of a queue?

My cluster has only one queue (default), and the cluster has 42 GB of memory. No applications are running according to the ResourceManager UI.

Please help me resolve this. Thanks in advance.
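One way to look at per-queue details is the ResourceManager REST endpoint `/ws/v1/cluster/scheduler`. The sketch below is written for Python 3 and assumes the CapacityScheduler, whose leaf queues report `numApplications` and `maxApplications`. The function names are made up for this example, and the ResourceManager web address (host and port, 8088 unless changed) is an assumption; the log above only shows the RPC port 8050.

```python
import json
from urllib.request import urlopen


def fetch_scheduler_info(rm_web_address):
    """Fetch scheduler info from the ResourceManager REST API.

    rm_web_address is "host:port" of the RM web UI (assumed, e.g. port 8088).
    """
    url = "http://%s/ws/v1/cluster/scheduler" % rm_web_address
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)["scheduler"]["schedulerInfo"]


def leaf_queue_app_counts(scheduler_info):
    """Return (queueName, numApplications, maxApplications) for each leaf queue."""
    results = []

    def walk(node):
        # Parent queues carry their children under "queues" -> "queue".
        children = (node.get("queues") or {}).get("queue", [])
        if children:
            for child in children:
                walk(child)
        else:
            results.append((node.get("queueName"),
                            node.get("numApplications"),
                            node.get("maxApplications")))

    walk(scheduler_info)
    return results


# Usage against a live cluster (address is a placeholder):
# for name, used, limit in leaf_queue_app_counts(fetch_scheduler_info("rm-host:8088")):
#     print(name, used, limit)
```

From the command line, `yarn queue -status default` and `yarn application -list -appStates ACCEPTED,RUNNING` should give a similar picture. If the queue reports 10000 applications while nothing is listed as accepted or running, that discrepancy is worth including in a follow-up.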