Created on 07-25-2018 11:57 PM - edited 08-17-2019 09:47 PM
I am trying to set up a 2-node cluster with Spark and Hive using the Ambari Cluster Install Wizard on Ubuntu 16.04. I passed the first 8 steps and got stuck at the "Install, Start and Test" step. Here is the error message from one of the nodes:
stderr:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py", line 35, in <module>
    BeforeAnyHook().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 375, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py", line 29, in hook
    setup_users()
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/shared_initialization.py", line 51, in setup_users
    fetch_nonlocal_groups = params.fetch_nonlocal_groups,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/accounts.py", line 84, in action_create
    shell.checked_call(command, sudo=True)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop,user,spark,git,wheel -g hadoop spark' returned 6. usermod: user 'spark' does not exist in /etc/passwd
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-399.json', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-399.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', '']
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-INSTALL/scripts/hook.py", line 37, in <module>
    BeforeInstallHook().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 382, in execute
    self.save_component_version_to_structured_out(self.command_name)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 244, in save_component_version_to_structured_out
    stack_select_package_name = stack_select.get_package_name()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/stack_select.py", line 110, in get_package_name
    package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/stack_select.py", line 224, in get_packages
    supported_packages = get_supported_packages()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/stack_select.py", line 148, in get_supported_packages
    raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
resource_management.core.exceptions.Fail: Unable to query for supported packages using /usr/bin/hdp-select

stdout:
2018-07-25 16:44:16,944 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=None -> 2.6
2018-07-25 16:44:16,948 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2018-07-25 16:44:16,948 - Group['livy'] {}
2018-07-25 16:44:16,949 - Group['spark'] {}
2018-07-25 16:44:16,955 - Group['hdfs'] {}
2018-07-25 16:44:16,955 - Group['hadoop'] {}
2018-07-25 16:44:16,956 - Group['users'] {}
2018-07-25 16:44:16,956 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-07-25 16:44:16,957 - Modifying user hive
2018-07-25 16:44:16,973 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-07-25 16:44:16,975 - Modifying user livy
2018-07-25 16:44:16,989 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-07-25 16:44:16,991 - Modifying user zookeeper
2018-07-25 16:44:17,006 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-07-25 16:44:17,008 - Modifying user spark
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-399.json', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-399.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', '']
Command failed after 1 tries
This is the configuration summary reported in the "Review" step, together with the screen output for the "Deploy" step:
Cluster Name : HW2N
Total Hosts : 2 (2 new)
Repositories:
  ubuntu16 (HDP-2.6): http://public-repo-1.hortonworks.com/HDP/ubuntu16/2.x/updates/2.6.5.0
  ubuntu16 (HDP-2.6-GPL): http://public-repo-1.hortonworks.com/HDP-GPL/ubuntu16/2.x/updates/2.6.5.0
  ubuntu16 (HDP-UTILS-1.1.0.22): http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/ubuntu16
Services:
  HDFS
    DataNode : 1 host
    NameNode : msl-dpe-perf77.msl.lab
    NFSGateway : 0 host
    SNameNode : msl-dpe-perf77.msl.lab
  YARN + MapReduce2
    App Timeline Server : msl-dpe-perf77.msl.lab
    NodeManager : 1 host
    ResourceManager : msl-dpe-perf77.msl.lab
  Tez
    Clients : 1 host
  Hive
    Metastore : msl-dpe-perf77.msl.lab
    HiveServer2 : msl-dpe-perf77.msl.lab
    WebHCat Server : msl-dpe-perf77.msl.lab
    Database : New MySQL Database
  HBase
    Master : msl-dpe-perf77.msl.lab
    RegionServer : 1 host
    Phoenix Query Server : 0 host
  Pig
    Clients : 1 host
  ZooKeeper
    Server : msl-dpe-perf77.msl.lab
  Ambari Metrics
    Metrics Collector : msl-dpe-perf77.msl.lab
    Grafana : msl-dpe-perf77.msl.lab
  SmartSense
    Activity Analyzer : msl-dpe-perf77.msl.lab
    Activity Explorer : msl-dpe-perf77.msl.lab
    HST Server : msl-dpe-perf77.msl.lab
  Spark
    Livy Server : 0 host
    History Server : msl-dpe-perf77.msl.lab
    Thrift Server : 0 host
  Spark2
    Livy for Spark2 Server : 0 host
    History Server : msl-dpe-perf77.msl.lab
    Thrift Server : 0 host
  Slider
    Clients : 1 host
Created 07-26-2018 05:07 AM
Created 07-26-2018 07:04 AM
Hi @Harry Li,
It looks like the installation is failing at this step:
2018-07-25 16:44:17,008 - Modifying user spark
Can you investigate whether this is a user-creation issue, or whether it is caused by extraneous entries for the user's groups in /etc/group?
You can get the full error log from /var/lib/ambari-agent/data/error-399.json and /var/lib/ambari-agent/data/output-399.json
on the node msl-dpe-perf74.msl.lab.
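One way to run the check suggested above is to compare the user and group names from the failing usermod command against the local account files. The following is a minimal sketch, not part of Ambari: the helper names parse_names and check_usermod_inputs are hypothetical, and it assumes the standard colon-separated layout of /etc/passwd and /etc/group. It only looks at local files, which is what the usermod message refers to.

```python
def parse_names(text):
    """Return the set of first fields (account or group names) from
    passwd- or group-style text, one colon-separated record per line."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        names.add(line.split(":", 1)[0])
    return names


def check_usermod_inputs(user, groups, passwd_text, group_text):
    """Return (user_is_local, missing_groups) for a planned usermod call.

    user_is_local is True when `user` has an entry in passwd_text;
    missing_groups lists the requested groups absent from group_text,
    in the order they were requested.
    """
    local_users = parse_names(passwd_text)
    local_groups = parse_names(group_text)
    missing = [g for g in groups if g not in local_groups]
    return user in local_users, missing
```

To use it on the affected node, read /etc/passwd and /etc/group into strings and call check_usermod_inputs("spark", ["hadoop", "user", "spark", "git", "wheel"], passwd_text, group_text), using the names from the usermod command in the error message.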
Created 07-26-2018 04:41 PM
Thanks Adi and Akhil
A closer look suggests that the issue is a failed run of hook.py. Here is the message:
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-399.json', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-399.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', '']
Traceback (most recent call last):
I tested this script manually and here is what I got:
harry.li@msl-dpe-perf74:/usr/lib/python2.6/site-packages$ sudo /usr/bin/python '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py'
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py", line 20, in <module>
    from resource_management import *
ImportError: No module named resource_management
I verified that Python 2.7.12 is installed correctly and that the resource_management directory has been installed correctly too. Is there a setting in Ambari to control the Python import path?
root@msl-dpe-perf74:/usr/lib/python2.6/site-packages# ls -l /usr/lib/ambari-agent/lib
total 20
drwxr-xr-x 3 root root 4096 Jul 25 17:54 ambari_commons
drwxr-xr-x 3 root root 4096 Jul 24 17:29 ambari_jinja2
drwxr-xr-x 2 root root 4096 Jul 24 17:29 ambari_simplejson
drwxr-xr-x 2 root root 4096 Jul 24 17:29 examples
drwxr-xr-x 4 root root 4096 Jul 24 17:29 resource_management
root@msl-dpe-perf74:/usr/lib/python2.6/site-packages# ls -l /usr/lib/ambari-agent/lib/resource_management/
total 16
drwxr-xr-x 5 root root 4096 Jul 24 17:29 core
-rwxrwxrwx 1 root root  887 Feb 23 11:10 __init__.py
-rw-r--r-- 1 root root 1049 Jul 24 17:29 __init__.pyc
drwxr-xr-x 6 root root 4096 Jul 24 17:29 libraries
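A package can exist on disk and still fail to import if none of the interpreter's search-path entries contains it. As a rough diagnostic, the sketch below lists which entries of sys.path hold a given package directory. The function name find_package_dirs is a hypothetical helper, and the check only covers ordinary packages with an __init__.py (like the resource_management directory listed above), not zip imports or namespace packages. It is written to run on either Python 2.7 or Python 3.

```python
import os
import sys


def find_package_dirs(package, search_path=None):
    """Return the entries of search_path (default: sys.path) that contain
    a directory named `package` with an __init__.py inside it."""
    if search_path is None:
        search_path = sys.path
    hits = []
    for entry in search_path:
        # An empty sys.path entry means the current directory.
        candidate = os.path.join(entry or os.curdir, package, "__init__.py")
        if os.path.isfile(candidate):
            hits.append(entry)
    return hits
```

Running find_package_dirs("resource_management") with the same interpreter that Ambari invokes (/usr/bin/python in the message above) returns an empty list when no search-path entry provides the package, which would match the ImportError.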
Created 07-26-2018 06:53 PM
The resource_management import error occurred because the Ambari wizard relies on modules under /usr/lib/python2.6/site-packages. On Ubuntu 16, Python 2.7 does not have this directory on its import path. It can be resolved by adding:
PYTHONPATH=/usr/lib/python2.6/site-packages
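To confirm that the PYTHONPATH setting has the intended effect, one option is to try the import in a child interpreter with and without the extra directory. This is a minimal sketch under assumptions: can_import is a hypothetical helper, not an Ambari function, and it only shows that the directory makes the module importable for a process started with that environment. It does not show where the variable has to be set for the Ambari agent itself.

```python
import os
import subprocess
import sys


def can_import(module, extra_dir=None, python=sys.executable):
    """Return True if `python -c "import <module>"` exits with status 0.

    When extra_dir is given, it is prepended to PYTHONPATH in the child
    process environment only; the current process is left unchanged.
    """
    env = dict(os.environ)
    if extra_dir is not None:
        existing = env.get("PYTHONPATH")
        env["PYTHONPATH"] = extra_dir if not existing else extra_dir + os.pathsep + existing
    with open(os.devnull, "w") as devnull:
        code = subprocess.call([python, "-c", "import " + module],
                               env=env, stdout=devnull, stderr=devnull)
    return code == 0
```

For the case in this thread, the comparison would be can_import("resource_management", python="/usr/bin/python") against can_import("resource_management", extra_dir="/usr/lib/python2.6/site-packages", python="/usr/bin/python"). If the first call returns False and the second returns True, the missing path entry accounts for the ImportError.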
Created 07-26-2018 07:01 PM
Great. Please accept your answer as the best answer and close this thread.