Created 10-18-2018 05:22 PM
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/resourcemanager.py", line 275, in <module> Resourcemanager().execute() File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute method(env) File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/resourcemanager.py", line 158, in start service('resourcemanager', action='start') File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk return fn(*args, **kwargs) File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/service.py", line 92, in service Execute(daemon_cmd, user = usr, not_if = check_process) File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__ self.env.run() File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run self.run_action(resource, action) File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action provider_action() File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run returns=self.resource.returns) File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner result = function(command, **kwargs) File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns) File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper result = _call(command, **kwargs_copy) File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call raise ExecutionFailed(err_msg, code, out, err) resource_management.core.exceptions.ExecutionFailed: Execution of 'ulimit -c unlimited; export HADOOP_LIBEXEC_DIR=/usr/hdp/3.0.0.0-1634/hadoop/libexec && /usr/hdp/3.0.0.0-1634/hadoop-yarn/bin/yarn --config /usr/hdp/3.0.0.0-1634/hadoop/conf --daemon start resourcemanager' returned 1.
Created 10-24-2018 07:38 AM
From your logs attached, it looks like you have enabled GPU Scheduling. But it is still using the DefaultResourceCalculator.
2018-10-22 17:48:02,490 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1495)) - Error starting ResourceManager org.apache.hadoop.yarn.exceptions.YarnRuntimeException: RM uses DefaultResourceCalculator which used only memory as resource-type but invalid resource-types specified {yarn.io/gpu=name: yarn.io/gpu, units: , type: COUNTABLE, value: 0, minimum allocation: 0, maximum allocation: 9223372036854775807, memory-mb=name: memory-mb, units: Mi, type: COUNTABLE, value: 0, minimum allocation: 1024, maximum allocation: 191488, vcores=name: vcores, units: , type: COUNTABLE, value: 0, minimum allocation: 1, maximum allocation: 32}. Use DomainantResourceCalculator instead to make effective use of these resource-types
In YARN->Configs->Advanced->Scheduler , set the following
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
Created 10-18-2018 10:23 PM
As we see that ambari triggered the following command which did not get executed. Hence there might be some additional logging happened inside the "resource manager" logs.
I will suggest you to please check and share the RM logs it should show the cause of command execution failure.
Other option to isolate the issue will be to try restarting the ResourceManager manually using the same command to see if it works?
# su - yarn # ulimit -c unlimited; export HADOOP_LIBEXEC_DIR=/usr/hdp/3.0.0.0-1634/hadoop/libexec && /usr/hdp/3.0.0.0-1634/hadoop-yarn/bin/yarn --config /usr/hdp/3.0.0.0-1634/hadoop/conf --daemon start resourcemanager
.
If you face any issue then please share the logs.
Created 10-22-2018 12:29 PM
Dear Jay,
Please find the resource manager log file after executed the above the command.
we have changed server name to localhost , but we r using FQN.
Thanks & Regards,
Prashant Gupta
Created 10-24-2018 07:30 AM
Dear Jay,
Any update for above the query.
Thanks & Regards,
Prashant Gupta
Created 10-24-2018 07:38 AM
From your logs attached, it looks like you have enabled GPU Scheduling. But it is still using the DefaultResourceCalculator.
2018-10-22 17:48:02,490 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1495)) - Error starting ResourceManager org.apache.hadoop.yarn.exceptions.YarnRuntimeException: RM uses DefaultResourceCalculator which used only memory as resource-type but invalid resource-types specified {yarn.io/gpu=name: yarn.io/gpu, units: , type: COUNTABLE, value: 0, minimum allocation: 0, maximum allocation: 9223372036854775807, memory-mb=name: memory-mb, units: Mi, type: COUNTABLE, value: 0, minimum allocation: 1024, maximum allocation: 191488, vcores=name: vcores, units: , type: COUNTABLE, value: 0, minimum allocation: 1, maximum allocation: 32}. Use DomainantResourceCalculator instead to make effective use of these resource-types
In YARN->Configs->Advanced->Scheduler , set the following
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
Created 10-25-2018 07:28 AM
Created 10-24-2018 10:45 AM
Dear Jay,
After added above the property it's working fine.
Thanks for support.
Regards,
Prashant Gupta
Created 10-24-2018 11:06 AM
@Prashant Gupta Good to know that the ResourceManager started successfully. Kindly mark the answer as accepted if the problem got resolved.
Created 04-10-2019 06:19 PM
Dear Jay,
When i am trying to enable HA in Ambari .then getting below error .
Traceback (most recent call last): File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 348, in <module> NameNode().execute() File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 375, in execute method(env) File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 90, in start upgrade_suspended=params.upgrade_suspended, env=env) File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk return fn(*args, **kwargs) File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 175, in namenode create_log_dir=True File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 276, in service Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports) File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__ self.env.run() File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run self.run_action(resource, action) File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action provider_action() File "/usr/lib/ambari-agent/lib/rexample.comesource_management/core/providers/system.py", line 262, in action_run tries=self.resource.tries, try_sleep=self.resource.try_sleep) File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner result = function(command, **kwargs) File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy) File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper result = _call(command, **kwargs_copy) File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 303, in _call raise ExecutionFailed(err_msg, code, out, err) resource_management.core.exceptions.ExecutionFailed: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/2.6.5.0-292/hadoop/sbin/hadoop-daemon.sh --config /usr/hdp/2.6.5.0-292/hadoop/conf start namenode'' returned 1. starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-HBDCAUTDBN14.cidr.gov.in.out SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Thanks & Regards,
Prashant Gupta