Cannot add node(s) to HDP 2.6.1 cluster through Cloudbreak

I'm running an HDP 2.6.1.0-129 cluster with Cloudbreak Deployer 2.4.2. It worked well until Dec 7th, 2018. Since then, however, adding node(s) to the cluster through the Cloudbreak Deployer always fails.

The following is the relevant error message from cbreak.log of the Cloudbreak Deployer:

cloudbreak_1   | 2018-12-13 06:12:11,102 [reactorDispatcher-34] buildLogContextForReactorHandler:69 INFO  c.s.c.l.LogContextAspects - [owner:9e997395-c6d1-498d-bfa2-1a0f508c7b21] [type:CLUSTER] [id:2] [name:nakagawa-test-2] [flow:ec31867c-978c-41ef-8796-896ecabd98ba] [tracking:] A Reactor event handler's 'accept' method has been intercepted: execution(Flow2Handler.accept(..)), MDC logger context is built.
cloudbreak_1   | 2018-12-13 06:12:11,116 [reactorDispatcher-34] execute:87 INFO  c.s.c.c.f.AbstractAction - [owner:9e997395-c6d1-498d-bfa2-1a0f508c7b21] [type:STACKVIEW] [id:2] [name:nakagawa-test-2] [flow:ec31867c-978c-41ef-8796-896ecabd98ba] [tracking:] Stack: 2, flow state: UPSCALING_AMBARI_STATE, phase: service, execution time 1024 sec
cloudbreak_1   | 2018-12-13 06:12:11,117 [reactorDispatcher-34] clusterUpscaleFailed:88 ERROR c.s.c.c.f.c.u.ClusterUpscaleFlowService - [owner:9e997395-c6d1-498d-bfa2-1a0f508c7b21] [type:STACKVIEW] [id:2] [name:nakagawa-test-2] [flow:ec31867c-978c-41ef-8796-896ecabd98ba] [tracking:] Error during Cluster upscale flow: com.sequenceiq.cloudbreak.orchestrator.exception.CloudbreakOrchestratorFailedException: Failed: Orchestrator component went failed in 7.500000 mins, message: There are missing nodes from job (jid: 20181213061135612084), target: [ip-10-0-1-203.ap-northeast-1.compute.internal, ip-10-0-1-68.ap-northeast-1.compute.internal]
cloudbreak_1   | Node: ip-10-0-1-203.ap-northeast-1.compute.internal Error(s): An exception occurred in this state: Traceback (most recent call last):
cloudbreak_1   |   File "/usr/lib/python2.7/dist-packages/salt/state.py", line 1843, in call
cloudbreak_1   |     **cdata['kwargs'])
cloudbreak_1   |   File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1795, in wrapper
cloudbreak_1   |     return f(*args, **kwargs)
cloudbreak_1   |   File "/usr/lib/python2.7/dist-packages/salt/states/pkg.py", line 1631, in installed
cloudbreak_1   |     **kwargs)
cloudbreak_1   |   File "/usr/lib/python2.7/dist-packages/salt/modules/yumpkg.py", line 1415, in install
cloudbreak_1   |     if re.match('kernel(-.+)?', name):
cloudbreak_1   |   File "/usr/lib64/python2.7/re.py", line 141, in match
cloudbreak_1   |     return _compile(pattern, flags).match(string)
cloudbreak_1   | TypeError: expected string or buffer
cloudbreak_1   |  | An exception occurred in this state: Traceback (most recent call last):
cloudbreak_1   |   File "/usr/lib/python2.7/dist-packages/salt/state.py", line 1843, in call
cloudbreak_1   |     **cdata['kwargs'])
cloudbreak_1   |   File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1795, in wrapper
cloudbreak_1   |     return f(*args, **kwargs)
cloudbreak_1   |   File "/usr/lib/python2.7/dist-packages/salt/states/pkg.py", line 1631, in installed
cloudbreak_1   |     **kwargs)
cloudbreak_1   |   File "/usr/lib/python2.7/dist-packages/salt/modules/yumpkg.py", line 1415, in install
cloudbreak_1   |     if re.match('kernel(-.+)?', name):
cloudbreak_1   |   File "/usr/lib64/python2.7/re.py", line 141, in match
cloudbreak_1   |     return _compile(pattern, flags).match(string)
cloudbreak_1   | TypeError: expected string or buffer

I also found the relevant error message in /var/log/salt/minion on the node I was trying to add:

2018-12-13 05:36:43,749 [salt.state                                                       ][INFO    ][8299] Running state [/etc/yum.repos.d/ambari.repo] at time 05:36:43.749534
2018-12-13 05:36:43,749 [salt.state                                                       ][INFO    ][8299] Executing state file.managed for [/etc/yum.repos.d/ambari.repo]
2018-12-13 05:36:43,756 [salt.fileclient                                                  ][DEBUG   ][8299] In saltenv 'base', looking at rel_path 'ambari/yum/ambari.repo' to resolve 'salt://ambari/yum/ambari.repo'
2018-12-13 05:36:43,757 [salt.fileclient                                                  ][DEBUG   ][8299] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/ambari/yum/ambari.repo' to resolve 'salt://ambari/yum/ambari.repo'
2018-12-13 05:36:43,757 [salt.utils.jinja                                                 ][DEBUG   ][8299] Jinja search path: ['/var/cache/salt/minion/files/base']
2018-12-13 05:36:43,761 [salt.state                                                       ][INFO    ][8299] File /etc/yum.repos.d/ambari.repo is in the correct state
2018-12-13 05:36:43,761 [salt.state                                                       ][INFO    ][8299] Completed state [/etc/yum.repos.d/ambari.repo] at time 05:36:43.761816 duration_in_ms=12.282
2018-12-13 05:36:43,768 [salt.utils.lazy                                                  ][DEBUG   ][8299] Could not LazyLoad pkg.ex_mod_init: 'pkg.ex_mod_init' is not available.
2018-12-13 05:36:43,768 [salt.state                                                       ][INFO    ][8299] Running state [ambari-agent] at time 05:36:43.768609
2018-12-13 05:36:43,768 [salt.state                                                       ][INFO    ][8299] Executing state pkg.installed for [ambari-agent]
2018-12-13 05:36:43,769 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['rpm', '-qa', '--queryformat', '%{NAME}_|-%{EPOCH}_|-%{VERSION}_|-%{RELEASE}_|-%{ARCH}_|-(none)'] in directory '/root'
2018-12-13 05:36:44,277 [salt.utils.lazy                                                  ][DEBUG   ][8299] Could not LazyLoad pkg.check_db: 'pkg.check_db' is not available.
2018-12-13 05:36:44,284 [salt.utils.lazy                                                  ][DEBUG   ][8299] Could not LazyLoad pkg.check_extra_requirements: 'pkg.check_extra_requirements' is not available.
2018-12-13 05:36:44,290 [salt.utils.lazy                                                  ][DEBUG   ][8299] Could not LazyLoad pkg.version_clean: 'pkg.version_clean' is not available.
2018-12-13 05:36:44,290 [salt.loaded.int.module.rpm                                       ][WARNING ][8299] rpmdevtools is not installed, please install it for more accurate version comparisons
2018-12-13 05:36:44,291 [salt.loaded.int.states.pkg                                       ][DEBUG   ][8299] Current version (['2.6.2.0-155']) did not match desired version specification (2.6.2.0), adding to installation targets
2018-12-13 05:36:44,291 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--quiet', 'clean', 'expire-cache'] in directory '/root'
2018-12-13 05:36:44,435 [salt.loaded.int.module.cmdmod                                    ][DEBUG   ][8299] output: 
2018-12-13 05:36:44,436 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--quiet', 'check-update'] in directory '/root'
2018-12-13 05:36:46,398 [salt.loaded.int.module.yumpkg                                    ][DEBUG   ][8299] Searching for repos in ['/etc/yum/repos.d', '/etc/yum.repos.d']
2018-12-13 05:36:46,401 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--version'] in directory '/root'
2018-12-13 05:36:46,518 [salt.loaded.int.module.cmdmod                                    ][DEBUG   ][8299] output: 3.4.3
  Installed: rpm-4.11.3-21.75.amzn1.x86_64 at 2017-11-20 22:11
  Built    : Amazon.com, Inc. <http://aws.amazon.com> at 2017-03-20 00:58
  Committed: Amazon Linux AMI <amazon-linux-ami@amazon.com> at 2016-11-04


  Installed: yum-3.4.3-150.70.amzn1.noarch at 2017-11-20 22:11
  Built    : Amazon.com, Inc. <http://aws.amazon.com> at 2017-08-10 23:50
  Committed: Heath Petty <hpetty@amazon.com> at 2017-08-10
2018-12-13 05:36:46,519 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--quiet', 'repository-packages', 'AMBARI.2.6.2.0', 'list', '--showduplicates'] in directory '/root'
2018-12-13 05:36:46,988 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--quiet', 'repository-packages', 'saltstack-amzn-repo', 'list', '--showduplicates'] in directory '/root'
2018-12-13 05:36:47,470 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--quiet', 'repository-packages', 'amzn-main', 'list', '--showduplicates'] in directory '/root'
2018-12-13 05:36:50,893 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--quiet', 'repository-packages', 'HDP-2.6-repo-1', 'list', '--showduplicates'] in directory '/root'
2018-12-13 05:36:51,478 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--quiet', 'repository-packages', 'amzn-updates', 'list', '--showduplicates'] in directory '/root'
2018-12-13 05:36:52,686 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--quiet', 'repository-packages', 'HDP-UTILS', 'list', '--showduplicates'] in directory '/root'
2018-12-13 05:36:52,772 [salt.minion                                           ][INFO    ][3466] User root Executing command saltutil.running with jid 20181213053652757085
2018-12-13 05:36:52,772 [salt.minion                                           ][DEBUG   ][3466] Command details {'tgt_type': 'glob', 'jid': '20181213053652757085', 'tgt': '*', 'ret': '', 'user': 'root', 'arg': [], 'fun': 'saltutil.running'}
2018-12-13 05:36:52,781 [salt.minion                                           ][INFO    ][8478] Starting a new job with PID 8478
2018-12-13 05:36:52,795 [salt.utils.lazy                                       ][DEBUG   ][8478] LazyLoaded saltutil.running
2018-12-13 05:36:52,796 [salt.utils.lazy                                       ][DEBUG   ][8478] LazyLoaded direct_call.get
2018-12-13 05:36:52,797 [salt.minion                                           ][DEBUG   ][8478] Minion return retry timer set to 9 seconds (randomized)
2018-12-13 05:36:52,797 [salt.minion                                           ][INFO    ][8478] Returning information for job: 20181213053652757085
2018-12-13 05:36:52,798 [salt.transport.zeromq                                 ][DEBUG   ][8478] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'ip-10-0-1-68.ap-northeast-1.compute.internal', 'tcp://10.0.1.203:4506', 'aes')
2018-12-13 05:36:52,798 [salt.crypt                                            ][DEBUG   ][8478] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'ip-10-0-1-68.ap-northeast-1.compute.internal', 'tcp://10.0.1.203:4506')
2018-12-13 05:36:52,805 [salt.minion                                           ][DEBUG   ][8478] minion return: {'fun_args': [], 'jid': '20181213053652757085', 'return': [{'tgt_type': 'glob', 'jid': '20181213053640897161', 'tgt': '*', 'pid': 8299, 'ret': '', 'user': 'saltuser', 'arg': [], 'fun': 'state.highstate'}], 'retcode': 0, 'success': True, 'fun': 'saltutil.running'}
2018-12-13 05:36:53,152 [salt.loaded.int.module.cmdmod                                    ][INFO    ][8299] Executing command ['yum', '--quiet', 'repository-packages', 'HDP-UTILS-1.1.0.21-repo-1', 'list', '--showduplicates'] in directory '/root'
2018-12-13 05:36:53,778 [salt.loaded.int.module.rpm                                       ][WARNING ][8299] rpmdevtools is not installed, please install it for more accurate version comparisons
2018-12-13 05:36:53,953 [salt.state                                                       ][ERROR   ][8299] An exception occurred in this state: Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/state.py", line 1843, in call
    **cdata['kwargs'])
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1795, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/salt/states/pkg.py", line 1631, in installed
    **kwargs)
  File "/usr/lib/python2.7/dist-packages/salt/modules/yumpkg.py", line 1415, in install
    if re.match('kernel(-.+)?', name):
  File "/usr/lib64/python2.7/re.py", line 141, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or buffer

Can you please help me find a solution or workaround?
Thank you in advance.
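
For anyone reading the minion log, here is a minimal Python sketch of the two symptoms it shows: the installed version "2.6.2.0-155" not matching the desired "2.6.2.0", and the TypeError raised by the regular expression check in yumpkg.py. The plain equality check and the use of None as the bad package name are assumptions for illustration only; Salt's real version comparison is more involved, and the traceback only tells us that the name was not a string.

```python
import re

# Values copied from the minion log above.
installed_versions = ["2.6.2.0-155"]
desired_version = "2.6.2.0"

# Assumption: a plain equality check stands in for Salt's real comparison.
# Without the build number, the desired version is treated as a mismatch.
needs_install = desired_version not in installed_versions
print("needs install:", needs_install)


def is_kernel_package(name):
    """Apply the same regular expression as the yumpkg.py line in the traceback."""
    return re.match('kernel(-.+)?', name) is not None


print("ambari-agent is a kernel package:", is_kernel_package("ambari-agent"))

# Assumption: None stands in for whatever non-string value reached re.match.
try:
    is_kernel_package(None)
except TypeError as exc:
    print("TypeError:", exc)
```

On Python 3 the exception text differs from the Python 2.7 message in the log, but the exception type is the same.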

1 ACCEPTED SOLUTION

Cloudera Employee

The Ambari version information seems to have changed. In Cloudbreak version 2.4.2 there is no way to update this data through API calls. You have to update it manually in the database instead:

  1. Connect to the Cloudbreak database
  2. In the cbdb schema, in the cluster table, find the cluster on which you experienced the problem and note its id
  3. Update the Ambari version information, e.g.:
    UPDATE clustercomponent SET attributes = '{"predefined":false,"version":"2.6.2.0-155","baseUrl":"http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.6.2.0","gpgKeyUrl":"http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins"}' WHERE componenttype = 'AMBARI_REPO_DETAILS' AND cluster_id = <id_of_the_cluster>;

  4. Verify that the update was successful

The above attribute value is just an example. Please replace it with the one in your database and edit the version field in the JSON to include the build number: "2.6.2.0-155"
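
If you would rather not edit the JSON by hand, the sketch below shows one way to change only the version field and leave the rest of the attributes untouched. It is a rough illustration, not part of Cloudbreak: the helper name with_build_number is made up here, and the "current" value assumes your stored attributes look like the example above with version "2.6.2.0". Paste in the value from your own database instead.

```python
import json


def with_build_number(attributes_json, full_version):
    """Return the attributes JSON with only the "version" field replaced."""
    attributes = json.loads(attributes_json)
    attributes["version"] = full_version
    # Compact separators keep the same shape as the example value above.
    return json.dumps(attributes, separators=(",", ":"))


# Assumption: this mirrors the example above, but without the build number.
current = (
    '{"predefined":false,"version":"2.6.2.0",'
    '"baseUrl":"http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.6.2.0",'
    '"gpgKeyUrl":"http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins"}'
)

updated = with_build_number(current, "2.6.2.0-155")
print(updated)
```

The printed string is what goes between the single quotes in the UPDATE statement.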


Thank you for your quick response. The workaround worked for my environment.