Created on 11-25-2019 03:31 PM - last edited on 11-25-2019 08:13 PM by ask_bill_brooks
Attempting to add a client node to a cluster via Ambari (v2.7.3.0, HDP 3.1.0.0-78) and seeing an odd error.
stderr:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 38, in <module>
    BeforeAnyHook().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 31, in hook
    setup_users()
  File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/shared_initialization.py", line 51, in setup_users
    fetch_nonlocal_groups = params.fetch_nonlocal_groups,
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/accounts.py", line 90, in action_create
    shell.checked_call(command, sudo=True)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
    self.save_component_version_to_structured_out(self.command_name)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
    stack_select_package_name = stack_select.get_package_name()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
    package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
    supported_packages = get_supported_packages()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
    raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select

stdout:
2019-11-25 13:07:57,644 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=None -> 3.1
2019-11-25 13:07:57,651 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2019-11-25 13:07:57,652 - Group['livy'] {}
2019-11-25 13:07:57,654 - Group['spark'] {}
2019-11-25 13:07:57,654 - Group['ranger'] {}
2019-11-25 13:07:57,654 - Group['hdfs'] {}
2019-11-25 13:07:57,654 - Group['zeppelin'] {}
2019-11-25 13:07:57,655 - Group['hadoop'] {}
2019-11-25 13:07:57,655 - Group['users'] {}
2019-11-25 13:07:57,656 - User['yarn-ats'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,658 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-25 13:07:57,971 - The repository with version 3.1.0.0-78 for this command has been marked as resolved. It will be used to report the version of the component which was installed
2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
    self.save_component_version_to_structured_out(self.command_name)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
    stack_select_package_name = stack_select.get_package_name()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
    package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
    supported_packages = get_supported_packages()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
    raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select
Command failed after 1 tries
The problem appears to be:
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
which seems to be triggered by the hook's attempt to modify the hive user:
2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
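One quick way to see where that is coming from (prompt is just my client node; nothing Ambari-specific here) is to check whether the hive user exists as a local user at all, or only through some other NSS source:
[root@client001~]# grep '^hive:' /etc/passwd      # local users only - this is what usermod cares about
[root@client001~]# getent passwd hive             # any NSS source (local files, LDAP, SSSD/AD, ...)
[root@client001~]# id hive                        # the UID/GID the system actually resolves for 'hive'
If the grep comes back empty but getent/id still return a user, then 'hive' exists somewhere other than the local passwd file, which is exactly what the usermod error message is complaining about.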
Though, when running
[root@HW001 .ssh]# /usr/bin/hdp-select versions
3.1.0.0-78
from the Ambari server node, I can see the command runs.
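Since the failing command is actually executed by the agent on the host being added, it's probably worth running the same sanity check there as well (client prompt shown; this is just a check, not a fix):
[root@client001~]# ls -l /usr/bin/hdp-select
[root@client001~]# /usr/bin/hdp-select versions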
Looking at what the hook script is trying to run/access, I see
[root@client001~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
-rw-r--r-- 1 root root 1.2K Nov 25 10:51 /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
[root@client001~]# ls -lha /var/lib/ambari-agent/data/command-632.json
-rw------- 1 root root 545K Nov 25 13:07 /var/lib/ambari-agent/data/command-632.json
[root@client001~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY
total 0
drwxr-xr-x 4 root root  34 Nov 25 10:51 .
drwxr-xr-x 8 root root 147 Nov 25 10:51 ..
drwxr-xr-x 2 root root  34 Nov 25 10:51 files
drwxr-xr-x 2 root root 188 Nov 25 10:51 scripts
[root@client001~]# ls -lha /var/lib/ambari-agent/data/structured-out-632.json
ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory
[root@client001~]# ls -lha /var/lib/ambari-agent/tmp
total 96K
drwxrwxrwt  3 root root 4.0K Nov 25 13:06 .
drwxr-xr-x 10 root root  267 Nov 25 10:50 ..
drwxr-xr-x  6 root root 4.0K Nov 25 13:06 ambari_commons
-rwx------ 1 root root 1.4K Nov 25 13:06 ambari-sudo.sh
-rwxr-xr-x 1 root root 1.6K Nov 25 13:06 create-python-wrap.sh
-rwxr-xr-x 1 root root 1.6K Nov 25 10:50 os_check_type1574715018.py
-rwxr-xr-x 1 root root 1.6K Nov 25 11:12 os_check_type1574716360.py
-rwxr-xr-x 1 root root 1.6K Nov 25 11:29 os_check_type1574717391.py
-rwxr-xr-x 1 root root 1.6K Nov 25 13:06 os_check_type1574723161.py
-rwxr-xr-x 1 root root 16K Nov 25 10:50 setupAgent1574715020.py
-rwxr-xr-x 1 root root 16K Nov 25 11:12 setupAgent1574716361.py
-rwxr-xr-x 1 root root 16K Nov 25 11:29 setupAgent1574717392.py
-rwxr-xr-x 1 root root 16K Nov 25 13:06 setupAgent1574723163.py
Notice there is "ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory". Not sure if this is normal, though.
Does anyone know what could be causing this, or have any debugging hints from this point?
Created 11-26-2019 01:21 PM
After just giving in and trying to manually create the hive user myself, I see
[root@airflowetl ~]# useradd -g hadoop -s /bin/bash hive
useradd: user 'hive' already exists
[root@airflowetl ~]# cat /etc/passwd | grep hive
[root@airflowetl ~]# id hive
uid=379022825(hive) gid=379000513(domain users) groups=379000513(domain users)
The fact that this existing user's UID looks like this, and that the user is not in /etc/passwd, made me think there is an existing Active Directory user (which this client node syncs with via SSSD) that already has the name hive. Checking our AD users, this turned out to be true.
Temporarily stopping the SSSD service to pause the sync with AD (service sssd stop) before rerunning the client host add in Ambari fixed the problem for me (I'm not sure whether you can get a server to ignore AD syncs for an individual user).
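For anyone hitting the same thing, the whole workaround was roughly this (run on the client node being added; these are the RHEL/CentOS 7 style service commands my nodes use, so adjust for your init system):
# pause the AD/SSSD sync so 'hive' and the other service accounts stop resolving from AD
service sssd stop
# ... now re-run the "Add Host" / install step from the Ambari UI, which can create the local users ...
# re-enable the AD sync afterwards
service sssd start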
Created 10-03-2021 01:14 AM
I was facing a similar error and resolved it by adding the Hadoop users to the /etc/passwd file.
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-59009.json', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-59009.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
>> File location: /etc/passwd
>> adduser hadoop
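In other words, create the missing accounts as local users before re-running the install. A minimal sketch (the exact list of users and groups depends on which services your cluster runs, so treat these as examples only):
# create the shared group and the service users the before-ANY hook expects
groupadd hadoop
useradd -g hadoop -G hadoop -s /bin/bash hive
useradd -g hadoop -G hadoop -s /bin/bash hdfs
useradd -g hadoop -G hadoop -s /bin/bash yarn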
Created 11-26-2019 12:14 PM
Adding some log-printing lines near the offending final line in the error trace, i.e. File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages, I print the return code and stdout:
2
ambari-python-wrap: can't open file '/usr/bin/hdp-select': [Errno 2] No such file or directory
So what the heck? It wants hdp-select to already be there, but the Ambari add-host UI complains if I manually install that binary myself beforehand. When I do manually install it (using the same repo file as on the rest of the existing cluster nodes; the install command itself is shown after the output below), all I see is...
0
Packages:
accumulo-client accumulo-gc accumulo-master accumulo-monitor accumulo-tablet accumulo-tracer atlas-client atlas-server beacon beacon-client beacon-server druid-broker druid-coordinator druid-historical druid-middlemanager druid-overlord druid-router druid-superset falcon-client falcon-server flume-server hadoop-client hadoop-hdfs-client hadoop-hdfs-datanode hadoop-hdfs-journalnode hadoop-hdfs-namenode hadoop-hdfs-nfs3 hadoop-hdfs-portmap hadoop-hdfs-secondarynamenode hadoop-hdfs-zkfc hadoop-httpfs hadoop-mapreduce-client hadoop-mapreduce-historyserver hadoop-yarn-client hadoop-yarn-nodemanager hadoop-yarn-registrydns hadoop-yarn-resourcemanager hadoop-yarn-timelinereader hadoop-yarn-timelineserver hbase-client hbase-master hbase-regionserver hive-client hive-metastore hive-server2 hive-server2-hive hive-server2-hive2 hive-webhcat hive_warehouse_connector kafka-broker knox-server livy-client livy-server livy2-client livy2-server mahout-client oozie-client oozie-server phoenix-client phoenix-server pig-client ranger-admin ranger-kms ranger-tagsync ranger-usersync shc slider-client spark-atlas-connector spark-client spark-historyserver spark-schema-registry spark-thriftserver spark2-client spark2-historyserver spark2-thriftserver spark_llap sqoop-client sqoop-server storm-client storm-nimbus storm-slider-client storm-supervisor superset tez-client zeppelin-server zookeeper-server
Aliases:
accumulo-server all client hadoop-hdfs-server hadoop-mapreduce-server hadoop-yarn-server hive-server
Command failed after 1 tries
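For reference, the manual install itself was nothing fancy; with the same HDP 3.1 repo file as the existing nodes already dropped into /etc/yum.repos.d/, it's roughly (package name as it appears in the HDP repos):
yum install -y hdp-select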