Created 01-16-2018 07:06 AM
What is the correct way to configure a custom repo (say, using JFrog Artifactory) that mirrors the public repo, so that a blueprint install can find the packages?
For instance, my /etc/yum.repos.d/ambari-hdp-1.repo is:
[HDP-2.6-repo-1]
name=HDP-2.6-repo-1
baseurl=http://myserver:8081/artifactory/hortonworks-hdp/
path=/
enabled=1
gpgcheck=0

[HDP-UTILS-1.1.0.21-repo-2]
name=HDP-UTILS-1.1.0.21-repo-2
baseurl=http://myserver:8081/artifactory/hortonworks-hdp-utils/
path=/
enabled=1
gpgcheck=0
So if I run sudo yum info hadoop-hdfs-datanode, it lists the right package.
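As a quick sanity check alongside yum info, the repo file can be parsed to confirm which repo ids and base URLs yum will actually use. This is only a minimal sketch: it assumes Python 3 with the standard configparser module, list_repos is a hypothetical helper name, and the sample text simply mirrors the file shown above.

```python
import configparser

# Sample content mirroring the repo file in the question.
SAMPLE_REPO = """\
[HDP-2.6-repo-1]
name=HDP-2.6-repo-1
baseurl=http://myserver:8081/artifactory/hortonworks-hdp/
path=/
enabled=1
gpgcheck=0

[HDP-UTILS-1.1.0.21-repo-2]
name=HDP-UTILS-1.1.0.21-repo-2
baseurl=http://myserver:8081/artifactory/hortonworks-hdp-utils/
path=/
enabled=1
gpgcheck=0
"""


def list_repos(repo_text):
    """Return {repo_id: baseurl} for every enabled section of a .repo file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    repos = {}
    for section in parser.sections():
        # yum treats a missing "enabled" key as enabled.
        if parser.get(section, "enabled", fallback="1") == "1":
            repos[section] = parser.get(section, "baseurl")
    return repos


if __name__ == "__main__":
    for repo_id, url in list_repos(SAMPLE_REPO).items():
        print(repo_id, "->", url)
```

On a real node the text would be read from /etc/yum.repos.d/ambari-hdp-1.repo instead of the sample string.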
As part of the 2.6 change to blueprints, the stack version has to be registered by registering a VDF file, following the methodology described here.
Once done, I can see my repository_version with id 1 at
/api/v1/stacks/HDP/versions/2.6/repository_versions/1/operating_systems/redhat6/repositories/HDP-2.6

{
  "href" : "https://fooserver:8443/api/v1/stacks/HDP/versions/2.6/repository_versions/1/operating_systems/redhat6/repositories/HDP-2.6",
  "Repositories" : {
    "applicable_services" : [ ],
    "base_url" : "http://myserver:8081/artifactory/hortonworks-hdp/",
    "components" : null,
    "default_base_url" : "",
    "distribution" : null,
    "latest_base_url" : "",
    "mirrors_list" : "",
    "os_type" : "redhat6",
    "repo_id" : "HDP-2.6",
    "repo_name" : "HDP",
    "repository_version_id" : 1,
    "stack_name" : "HDP",
    "stack_version" : "2.6",
    "unique" : false
  }
}
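A response like the one above can be checked programmatically to confirm that the registered base_url matches the baseurl in the local yum repo file. The following is a minimal sketch under stated assumptions: it runs on Python 3, repo_base_url and same_location are hypothetical helper names, and the sample JSON is a trimmed copy of the response above rather than a live API call.

```python
import json

# Trimmed copy of the repository resource shown above.
SAMPLE_RESPONSE = """{
  "Repositories": {
    "base_url": "http://myserver:8081/artifactory/hortonworks-hdp/",
    "repo_id": "HDP-2.6",
    "repository_version_id": 1
  }
}"""


def repo_base_url(response_text):
    """Extract (repo_id, base_url) from a repository resource."""
    repo = json.loads(response_text)["Repositories"]
    return repo["repo_id"], repo["base_url"]


def same_location(url_a, url_b):
    """Compare two repo URLs, ignoring a trailing slash."""
    return url_a.rstrip("/") == url_b.rstrip("/")


if __name__ == "__main__":
    repo_id, base_url = repo_base_url(SAMPLE_RESPONSE)
    yum_baseurl = "http://myserver:8081/artifactory/hortonworks-hdp"
    print(repo_id, same_location(base_url, yum_baseurl))
```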
However, after registering the cluster template with "repository_version_id" : 1, the cluster creation request fails to find the repos on any of the nodes. Where exactly do I specify the version so that the blueprint install can pick it up?
I have also attempted to set the repo version in /var/lib/ambari-server/resources/stacks/HDP/2.6/repos/repoinfo.xml, but to no avail.
2018-01-16 06:35:50,800 - Unable to load available packages
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 771, in load_available_packages
    self.available_packages_in_repos = pkg_provider.get_available_packages_in_repos(repos)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 85, in get_available_packages_in_repos
    available_packages.extend(self._get_available_packages(repo))
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 146, in _get_available_packages
    return self._lookup_packages(cmd, 'Available Packages')
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 191, in _lookup_packages
    if items[i + 2].find('@') == 0:
IndexError: list index out of range
2018-01-16 06:35:51,308 - The 'hadoop-hdfs-datanode' component did not advertise a version. This may indicate a problem with the component packaging.
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 155, in <module>
    DataNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 367, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 48, in install
    import params
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/params.py", line 25, in <module>
    from params_linux import *
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/params_linux.py", line 391, in <module>
    lzo_packages = get_lzo_packages(stack_version_unformatted)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_lzo_packages.py", line 45, in get_lzo_packages
    lzo_packages += [script_instance.format_package_name("hadooplzo_${stack_version}"),
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 538, in format_package_name
    raise Fail("Cannot match package for regexp name {0}. Available packages: {1}".format(name, self.available_packages_in_repos))
resource_management.core.exceptions.Fail: Cannot match package for regexp name hadooplzo_${stack_version}. Available packages: []
2018-01-16 06:36:04,670 - Unable to load available packages
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 771, in load_available_packages
    self.available_packages_in_repos = pkg_provider.get_available_packages_in_repos(repos)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 85, in get_available_packages_in_repos
    available_packages.extend(self._get_available_packages(repo))
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 146, in _get_available_packages
    return self._lookup_packages(cmd, 'Available Packages')
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 191, in _lookup_packages
    if items[i + 2].find('@') == 0:
IndexError: list index out of range
2018-01-16 06:36:05,184 - The 'hadoop-hdfs-datanode' component did not advertise a version. This may indicate a problem with the component packaging.
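The final Fail in the log above says that the pattern hadooplzo_${stack_version} could not be matched against an empty list of available packages, so the earlier IndexError (which left that list empty) is the likely root of the second error. The sketch below is a simplified illustration of that kind of matching, not Ambari's actual code: match_package is a hypothetical name, the regular expression for the version part is an assumption, and the sample package names are illustrative. It assumes Python 3.

```python
import re

PLACEHOLDER = "${stack_version}"


def match_package(name_pattern, available_packages):
    """Return the first available package that matches the pattern.

    Raises LookupError when nothing matches, which is what happens for
    any pattern once the list of available packages is empty.
    """
    prefix, placeholder, suffix = name_pattern.partition(PLACEHOLDER)
    if not placeholder:
        return name_pattern  # no placeholder, nothing to resolve
    # Assumption: the version part is digits separated by underscores.
    regex = re.compile(
        "^" + re.escape(prefix) + r"\d+(?:_\d+)*" + re.escape(suffix) + "$")
    for package in available_packages:
        if regex.match(package):
            return package
    raise LookupError(
        "Cannot match package for regexp name {0}. "
        "Available packages: {1}".format(name_pattern, available_packages))


if __name__ == "__main__":
    packages = ["hadooplzo_2_6_4_0_91", "hadooplzo_2_6_4_0_91-native"]
    print(match_package("hadooplzo_${stack_version}", packages))
    try:
        match_package("hadooplzo_${stack_version}", [])
    except LookupError as exc:
        print(exc)
```

Seen this way, hardcoding the package name only sidesteps the lookup; the empty package list still points at the repositories not being visible to the agent.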
Created 01-31-2018 07:27 AM
The responses in this thread helped me with the problems I had. However, the right answer is that when using blueprints from version 2.6 onwards, where a VDF file is registered, the repositories have to be specified in that file. That input is then used to create another repo file, ambari-hdp-repo-1.repo, which is the one used from then on.
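To make that concrete, here is a minimal sketch of pointing the repositories in a VDF at an internal mirror before registering it. It is an illustration only: the element names (repository-info, os, repo, baseurl, repoid, reponame) are written from memory of the public HDP VDF files and should be checked against the actual file, point_vdf_at_mirror is a hypothetical helper, and the sample XML is heavily trimmed. It assumes Python 3.

```python
import xml.etree.ElementTree as ET

# Keep the usual "xsi" prefix if the real VDF declares that namespace.
ET.register_namespace("xsi", "http://www.w3.org/2001/XMLSchema-instance")

# Heavily trimmed VDF; element names are an assumption (see the note above).
SAMPLE_VDF = """\
<repository-version>
  <repository-info>
    <os family="redhat6">
      <repo>
        <baseurl>http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.6.4.0</baseurl>
        <repoid>HDP-2.6</repoid>
        <reponame>HDP</reponame>
      </repo>
    </os>
  </repository-info>
</repository-version>
"""


def point_vdf_at_mirror(vdf_text, mirror_urls):
    """Return the VDF XML with each repo's baseurl set to mirror_urls[repoid]."""
    root = ET.fromstring(vdf_text)
    for repo in root.iter("repo"):
        repo_id = repo.findtext("repoid")
        if repo_id in mirror_urls:
            repo.find("baseurl").text = mirror_urls[repo_id]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    mirrors = {"HDP-2.6": "http://myserver:8081/artifactory/hortonworks-hdp/"}
    print(point_vdf_at_mirror(SAMPLE_VDF, mirrors))
```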
Created 01-16-2018 08:19 AM
This is a bug (https://hortonworks.jira.com/browse/EAR-7270) fixed in Ambari 2.6.0.0. For now, you can work around it by editing the following file:
File: /usr/lib/ambari-agent/lib/resource_management/libraries/functions/get_lzo_packages.py

Change from:

  else:
    lzo_packages += [script_instance.format_package_name("hadooplzo_${stack_version}"),
                     script_instance.format_package_name("hadooplzo_${stack_version}-native")]

To:

  else:
    lzo_packages += [script_instance.format_package_name("hadooplzo_2_6_2_2035"),
                     script_instance.format_package_name("hadooplzo_2_6_2_2035-native")]
I have hardcoded it to "hadooplzo_2_6_2_2035", but you can change it to the version installed on your system.
Thanks,
Aditya
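For anyone applying this workaround, the suffix to hardcode can be derived from the stack build version rather than typed by hand. The sketch below is an assumption-labelled illustration: it relies on the naming seen later in this thread, where version 2.6.4.0-91 corresponds to the package oozie_2_6_4_0_91, and package_suffix and lzo_package_names are hypothetical helper names. It assumes Python 3.

```python
def package_suffix(stack_version):
    """Convert a build version such as '2.6.4.0-91' into '2_6_4_0_91'."""
    return stack_version.replace(".", "_").replace("-", "_")


def lzo_package_names(stack_version):
    """Return the two hadooplzo package names for the given build version."""
    suffix = package_suffix(stack_version)
    return ["hadooplzo_" + suffix, "hadooplzo_" + suffix + "-native"]


if __name__ == "__main__":
    print(lzo_package_names("2.6.4.0-91"))
```

The resulting names should still be confirmed with yum info on a node before they are hardcoded.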
Created 01-16-2018 09:19 PM
Thanks. I assume you mean that I would need to change that for ALL components. In the example above, the datanode install required the hadooplzo compression libraries, but the version available for each of them would be different.
I guess the above JIRA is private. Is this fix present in Ambari 2.6.0.0? In any case, changing the version for each component would be quite tedious to get right, even once it is automated to be applied on ALL the nodes for ALL the components.
Is there a better fix for this, so that the stack_version is picked up from the registered version? I am unable to install the cluster using blueprints or the UI with this bug in place.
Created 01-17-2018 05:07 AM
I guess you need not change it for all the components. Changing it for lzo should be enough. Make the change on all the nodes, at the path mentioned above.
Created 01-17-2018 06:42 AM
I had tried that, but it did not work. At its core, the issue was somewhat related to this question, which does not have an official answer.
Instead, I have moved on to Ambari 2.6.1 and the HDP 2.6.4 stack. The cluster installed successfully but did not start, because the METRICS_MONITOR service could not start, and that problem looks similar to this. Sigh!
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_monitoring/main.py", line 27, in <module>
    from core.controller import Controller
  File "/usr/lib/python2.6/site-packages/resource_monitoring/core/controller.py", line 27, in <module>
    from metric_collector import MetricsCollector
  File "/usr/lib/python2.6/site-packages/resource_monitoring/core/metric_collector.py", line 23, in <module>
    from host_info import HostInfo
  File "/usr/lib/python2.6/site-packages/resource_monitoring/core/host_info.py", line 22, in <module>
    import psutil
  File "/usr/lib/python2.6/site-packages/resource_monitoring/psutil/build/lib.linux-x86_64-2.7/psutil/__init__.py", line 89, in <module>
    import psutil._pslinux as _psplatform
  File "/usr/lib/python2.6/site-packages/resource_monitoring/psutil/build/lib.linux-x86_64-2.7/psutil/_pslinux.py", line 20, in <module>
    from psutil import _common
ImportError: cannot import name _common
psutil binaries need to be built by running, psutil/build.py
Created 01-17-2018 06:41 PM
@Aditya Sirna So installing the cluster from the public repositories did work. However, after switching back to my local repos, it seems that the Oozie tomcat is being picked up from the bigtop repository rather than the HDP repository specified in the VDF.
My internal repository is set to point to http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.6.4.0 for HDP components, and a yum info oozie_2_6_4_0_91 shows that the package is available from the HDP-2.6-repo-1 repo, which points to my internal repo...
2018-01-17 18:15:24,397 - The 'oozie-client' component did not advertise a version. This may indicate a problem with the component packaging. However, the stack-select tool was able to report a single version installed (2.6.4.0-91). This is the version that will be reported.
2018-01-17 18:16:28,013 - The 'oozie-client' component did not advertise a version. This may indicate a problem with the component packaging. However, the stack-select tool was able to report a single version installed (2.6.4.0-91). This is the version that will be reported.
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_client.py", line 71, in <module>
    OozieClient().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 375, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_client.py", line 33, in install
    self.install_packages(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 821, in install_packages
    retry_count=agent_stack_retry_count)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 53, in action_install
    self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 264, in install_package
    self.checked_call_with_retries(cmd, sudo=True, logoutput=self.get_logoutput())
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 266, in checked_call_with_retries
    return self._call_with_retries(cmd, is_checked=True, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 283, in _call_with_retries
    code, out = func(cmd, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/bin/yum -d 0 -e 0 -y install oozie_2_6_4_0_91' returned 1. Error Downloading Packages:
  bigtop-tomcat-6.0.48-1.noarch: failure: bigtop-tomcat/bigtop-tomcat-6.0.48-1.noarch.rpm from HDP-2.6-repo-1: [Errno 256] No more mirrors to try.
Created 01-17-2018 09:37 PM
Resolved the above Oozie problem; the cause was a corrupt bigtop-tomcat package in our repository.
Created 01-17-2018 07:07 AM
Sometimes the "psutil" binaries in the build folder get corrupted because of OS-level changes (such as a Python version change), so they need to be built again. Please try this:
1. Stop ambari metrics monitor.
2. Build psutil as follows:
# cd /usr/lib/python2.6/site-packages/resource_monitoring/
# python psutil/build.py
3. Then restart ambari metrics monitor.
If it does not work, please check which Python version you are running:
# python --version
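The traceback in the question already hints at a version mismatch: the module lives under python2.6 site-packages, but the psutil build directory is named lib.linux-x86_64-2.7. A minimal sketch of checking this automatically follows. The helper names are hypothetical, the directory naming convention (platform followed by the Python major.minor version) is assumed from the path in the traceback, and the sketch is written for Python 3 although the logic is version-independent.

```python
import re
import sys


def build_dir_python_version(path):
    """Extract (major, minor) from a build dir such as 'lib.linux-x86_64-2.7'."""
    match = re.search(r"lib\.[^/]*-(\d+)\.(\d+)(?:/|$)", path)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def matches_interpreter(path, version_info=None):
    """True if the build dir was produced for the given (or running) Python."""
    version_info = version_info or sys.version_info
    return build_dir_python_version(path) == tuple(version_info[:2])


if __name__ == "__main__":
    path = ("/usr/lib/python2.6/site-packages/resource_monitoring/"
            "psutil/build/lib.linux-x86_64-2.7/psutil/__init__.py")
    print(build_dir_python_version(path))
    print(matches_interpreter(path))
```

If the check fails, rebuilding with the interpreter that the metrics monitor actually uses is the step described above.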
Created 01-17-2018 06:35 PM
Well, that did work. However, as I am doing a blueprint install, this means that as soon as the metrics collector is installed I would need to run this script. Is there a hook in the Python scripts where I can inject this command, so that I don't have to wait for the services to fail to start, then run this command on all the nodes through Ansible, and then issue a restart command?
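No install hook is named in this thread, so one option is to script the stop, rebuild, start sequence instead of waiting for the failure. The sketch below covers only the stop and start requests and only builds them without sending anything. It assumes Ambari's REST API for host components (a PUT that sets HostRoles/state to INSTALLED or STARTED, with the X-Requested-By header), omits authentication, and uses placeholder cluster and host names; host_component_request is a hypothetical helper. It assumes Python 3.

```python
import json
import urllib.request


def host_component_request(base_url, cluster, host, component, state, context):
    """Build (but do not send) a PUT that sets a host component's desired state."""
    url = "{0}/api/v1/clusters/{1}/hosts/{2}/host_components/{3}".format(
        base_url.rstrip("/"), cluster, host, component)
    body = {
        "RequestInfo": {"context": context},
        "Body": {"HostRoles": {"state": state}},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Requested-By": "ambari"},
        method="PUT",
    )


if __name__ == "__main__":
    # Placeholder names; credentials would have to be added before sending.
    stop = host_component_request(
        "https://fooserver:8443", "mycluster", "node1.example.com",
        "METRICS_MONITOR", "INSTALLED", "Stop Metrics Monitor")
    start = host_component_request(
        "https://fooserver:8443", "mycluster", "node1.example.com",
        "METRICS_MONITOR", "STARTED", "Start Metrics Monitor")
    print(stop.get_method(), stop.full_url)
    print(start.get_method(), start.full_url)
```

The psutil rebuild itself would still have to run on each node between the two requests, for example through the same Ansible play.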
Created 01-17-2018 09:51 PM
Thanks for your inputs, @Jay Kumar SenSharma and @Aditya Sirna. For me, updating to the latest version, 2.6.1, was a possibility, and that worked.
As feedback: the blueprint installation will work, but in case of failures when starting services (for instance, the metrics monitor failing because of Python dependencies), can there be a hook in the process, or really a part of the installation itself, that takes care of this?
Otherwise we are basically left with a cluster that is installed but cannot be started.
If certain components fail to install (say, on one node), and the subsequent packages fail as a consequence, would it be possible to restart the cluster provisioning request from that point?
Is it possible to have a hierarchy of component installation and start? That is, is it really necessary to install and start the metrics monitor before the core services?