Failed to complete installation on host XYZ

Explorer

I am trying to add a new host, but it fails with this message at the end:

"Command 30(GlobalHostInstall) has completed. finalstate:FINISHED, success:false, msg:Failed to complete installation."

The following is from "/var/log/cloudera-scm-server/cloudera-scm-server.log":

10:58:25.348 AM INFO ServiceHandlerRegistry Executing command GlobalHostInstall GlobalHostInstallCommandArgs{sshPort=22, userName=butmah, password=REDACTED, passphrase=REDACTED, privateKey=REDACTED, parallelInstallCount=10, cmRepoUrl=null, gpgKeyCustomUrl=null, gpgKeyOverrideBundle=<none>, unlimitedJCE=false, javaInstallStrategy=NONE, agentUserMode=ROOT, cdhVersion=-1, cdhRelease=NONE, cdhRepoUrl=null, buildCertCommand=, sslCertHostname=null, reqId=25, skipPackageInstall=false, skipCloudConfig=false, proxyProtocol=HTTP, proxyServer=10.4.32.3, proxyPort=8080, proxyUserName=null, proxyPassword=REDACTED, hosts=[node1, node2, node3, node4, node5], existingHosts=[]}.
10:58:25.349 AM INFO CmdStep Executing command work: Execute 1 steps in sequence
10:58:25.349 AM INFO CmdStep Executing command work: Install on 1 hosts.
10:58:25.349 AM INFO CmdStep Executing command work: Install on node5.
10:58:25.350 AM INFO NodeConfiguratorService Adding password-based configurator for node5
10:58:25.350 AM INFO NodeConfiguratorService Submitted configurator for node5 with id 30
10:58:25.357 AM INFO NodeConfiguratorProgress node5: Transitioning from INIT (PT0.008S) to CONNECT
10:58:25.359 AM INFO TransportImpl Client identity string: SSH-2.0-SSHJ_0_14_0
10:58:25.361 AM INFO JavaMelodyFacade Exiting HTTP Operation: Method:POST, Path:/add-hosts-wizard/installretry.json, Status:200
10:58:25.368 AM INFO TransportImpl Server identity string: SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.8
10:58:25.422 AM INFO NodeConfiguratorProgress node5: Transitioning from CONNECT (PT0.065S) to AUTHENTICATE
10:58:25.474 AM INFO NodeConfiguratorProgress node5: Transitioning from AUTHENTICATE (PT0.052S) to MAKE_TEMP_DIR
10:58:25.616 AM INFO NodeConfigurator Executing mktemp -d /tmp/scm_prepare_node.XXXXXXXX on node5
10:58:25.620 AM INFO NodeConfiguratorProgress node5: Transitioning from MAKE_TEMP_DIR (PT0.146S) to COPY_FILES
10:58:25.719 AM INFO NodeConfigurator Using key bundle from URL: https://archive.cloudera.com/cm6/6.2.0/allkeys.asc
10:58:26.063 AM INFO NodeConfiguratorProgress node5: Transitioning from COPY_FILES (PT0.443S) to CHMOD
10:58:26.068 AM INFO NodeConfigurator Executing chmod a+x /tmp/scm_prepare_node.vQZe0yDf/scm_prepare_node.sh on node5
10:58:26.074 AM INFO NodeConfiguratorProgress node5: Transitioning from CHMOD (PT0.011S) to EXECUTE_SCRIPT
10:58:26.142 AM INFO NodeConfigurator Executing bash -c 'bash /tmp/scm_prepare_node.vQZe0yDf/scm_prepare_node.sh  --server_version 6.2.0 --server_build 968826 --packages /tmp/scm_prepare_node.vQZe0yDf/packages.scm --always /tmp/scm_prepare_node.vQZe0yDf/always_install.scm --x86_64 /tmp/scm_prepare_node.vQZe0yDf/x86_64_packages.scm --certtar /tmp/scm_prepare_node.vQZe0yDf/cert.tar --unlimitedJCE false --javaInstallStrategy NONE --agentUserMode ROOT --cm https://archive.cloudera.com/cm6/6.2.0 --skipCloudConfig false | tee /tmp/scm_prepare_node.vQZe0yDf/scm_prepare_node.log; exit ${PIPESTATUS[0]}' on node5
10:58:27.145 AM INFO NodeConfiguratorProgress node5: Transitioning from EXECUTE_SCRIPT (PT1.071S) to SCRIPT_START
10:58:27.145 AM INFO NodeConfiguratorProgress node5: Transitioning from SCRIPT_START (PT0S) to TAKE_LOCK
10:58:27.145 AM INFO NodeConfiguratorProgress node5: Transitioning from TAKE_LOCK (PT0S) to DETECT_ROOT
10:58:27.145 AM INFO NodeConfiguratorProgress node5: Transitioning from DETECT_ROOT (PT0S) to DETECT_DISTRO
10:58:27.145 AM INFO NodeConfiguratorProgress node5: Transitioning from DETECT_DISTRO (PT0S) to DETECT_SCM
10:58:28.146 AM INFO NodeConfiguratorProgress node5: Transitioning from DETECT_SCM (PT1.001S) to REPO_INSTALL
10:58:28.146 AM INFO NodeConfiguratorProgress node5: Transitioning from REPO_INSTALL (PT0S) to REFRESH_METADATA
10:58:30.354 AM INFO JavaMelodyFacade Entering HTTP Operation: Method:POST, Path:/add-hosts-wizard/installprogressdata.json
10:58:30.355 AM INFO JavaMelodyFacade Exiting HTTP Operation: Method:POST, Path:/add-hosts-wizard/installprogressdata.json, Status:200
10:58:32.079 AM INFO AgentAvroServlet (3 skipped) AgentAvroServlet: heartbeat processing stats: average=2ms, min=1ms, max=34ms.
10:58:32.150 AM INFO NodeConfiguratorProgress node5: Transitioning from REFRESH_METADATA (PT4.004S) to PACKAGE_INSTALL cloudera-manager-agent
10:58:32.160 AM WARN NodeConfigurator Command bash -c 'bash /tmp/scm_prepare_node.vQZe0yDf/scm_prepare_node.sh  --server_version 6.2.0 --server_build 968826 --packages /tmp/scm_prepare_node.vQZe0yDf/packages.scm --always /tmp/scm_prepare_node.vQZe0yDf/always_install.scm --x86_64 /tmp/scm_prepare_node.vQZe0yDf/x86_64_packages.scm --certtar /tmp/scm_prepare_node.vQZe0yDf/cert.tar --unlimitedJCE false --javaInstallStrategy NONE --agentUserMode ROOT --cm https://archive.cloudera.com/cm6/6.2.0 --skipCloudConfig false | tee /tmp/scm_prepare_node.vQZe0yDf/scm_prepare_node.log; exit ${PIPESTATUS[0]}' on node5 finished with exit status 1
10:58:32.160 AM INFO NodeConfiguratorProgress node5: Setting PACKAGE_INSTALL cloudera-manager-agent as failed and done state
10:58:32.160 AM INFO TransportImpl Disconnected - BY_APPLICATION
10:58:35.363 AM INFO JavaMelodyFacade Entering HTTP Operation: Method:POST, Path:/add-hosts-wizard/installprogressdata.json
10:58:35.364 AM INFO JavaMelodyFacade Exiting HTTP Operation: Method:POST, Path:/add-hosts-wizard/installprogressdata.json, Status:200
10:58:35.369 AM INFO JavaMelodyFacade Entering HTTP Operation: Method:POST, Path:/express-wizard/updateHostsState
10:58:35.370 AM INFO JavaMelodyFacade Exiting HTTP Operation: Method:POST, Path:/express-wizard/updateHostsState, Status:200
10:58:35.375 AM ERROR WorkOutputs CMD id: 30 Failed to complete installation on host node5.
10:58:35.375 AM ERROR DbCommand Command 30(GlobalHostInstall) has completed. finalstate:FINISHED, success:false, msg:Failed to complete installation.
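
The server log records which step failed on each host, but not why. As a minimal sketch (the function name `failed_install_steps` is hypothetical, and the pattern assumes the "Setting ... as failed and done state" wording seen in the log above), the failing step can be pulled out like this:

```shell
# Print "<host> <failed step>" for every install step that the server log
# marks as failed. Reads the log file(s) given as arguments, or stdin when
# no argument is given.
failed_install_steps() {
    sed -n 's/^.*NodeConfiguratorProgress \([^:]*\): Setting \(.*\) as failed and done state.*$/\1 \2/p' "$@"
}

# Usage (path taken from the post):
#   failed_install_steps /var/log/cloudera-scm-server/cloudera-scm-server.log
```

For the log above this prints "node5 PACKAGE_INSTALL cloudera-manager-agent". The reason for the failure is in the per-host log that the install command tees to, here /tmp/scm_prepare_node.vQZe0yDf/scm_prepare_node.log on node5.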

 

Explorer

More information:

The following is the "/tmp/scm_prepare_node.vQZe0yDf/scm_prepare_node.log" on the failing host:

 

using SSH_CLIENT to get the SCM hostname: 10.4.34.22 37758 22
opening logging file descriptor
###CLOUDERA_SCM### SCRIPT_START
###CLOUDERA_SCM### TAKE_LOCK
BEGIN flock 4
END (0)
###CLOUDERA_SCM### DETECT_ROOT
effective UID is 1000
BEGIN which pbrun
END (1)
BEGIN sudo -S id
uid=0(root) gid=0(root) groups=0(root)
END (0)
Using 'sudo ' to acquire root privileges
###CLOUDERA_SCM### DETECT_DISTRO
BEGIN grep 'Ubuntu' /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_DESCRIPTION="Ubuntu 16.04.1 LTS"
END (0)
BEGIN grep DISTRIB_CODENAME /etc/lsb-release
DISTRIB_CODENAME=xenial
END (0)
BEGIN echo DISTRIB_CODENAME=xenial | cut -d = -f 2
xenial
END (0)
###CLOUDERA_SCM### DETECT_SCM
BEGIN host -t PTR 10.4.34.22
Host 22.34.4.10.in-addr.arpa. not found: 3(NXDOMAIN)
END (1)
BEGIN which python
/usr/bin/python
END (0)
BEGIN python -c 'import socket; import sys; s = socket.socket(socket.AF_INET); s.settimeout(5.0); s.connect((sys.argv[1], int(sys.argv[2]))); s.close();' 10.4.34.22 7182
END (0)
BEGIN which wget
/usr/bin/wget
END (0)
BEGIN wget -qO- -T 1 -t 1 http://169.254.169.254/latest/meta-data/public-hostname && /bin/echo
END (4)
###CLOUDERA_SCM### REPO_INSTALL
Checking https://archive.cloudera.com/cm6/6.2.0/ubuntu1604/apt/dists/
Checking https://archive.cloudera.com/cm6/6.2.0/dists/
Using
installing repository file /tmp/scm_prepare_node.vQZe0yDf/repos/ubuntu_xenial/cloudera-manager.list
repository file /tmp/scm_prepare_node.vQZe0yDf/repos/ubuntu_xenial/cloudera-manager.list installed
installing apt keys
BEGIN sudo apt-key add /tmp/scm_prepare_node.vQZe0yDf/customGPG
OK
END (0)
installing priority file /tmp/scm_prepare_node.vQZe0yDf/ubuntu_xenial
priority file /tmp/scm_prepare_node.vQZe0yDf/ubuntu_xenial installed
###CLOUDERA_SCM### REFRESH_METADATA
BEGIN sudo apt-get update
Hit:1 http://security.ubuntu.com/ubuntu xenial-security InRelease
Hit:2 http://us.archive.ubuntu.com/ubuntu xenial InRelease
Hit:3 http://us.archive.ubuntu.com/ubuntu xenial-updates InRelease
Hit:4 http://us.archive.ubuntu.com/ubuntu xenial-backports InRelease
Reading package lists...
END (0)
BEGIN sudo apt-get update
Hit:1 http://us.archive.ubuntu.com/ubuntu xenial InRelease
Hit:2 http://security.ubuntu.com/ubuntu xenial-security InRelease
Hit:3 http://us.archive.ubuntu.com/ubuntu xenial-updates InRelease
Hit:4 http://us.archive.ubuntu.com/ubuntu xenial-backports InRelease
Reading package lists...
END (0)
###CLOUDERA_SCM### PACKAGE_INSTALL cloudera-manager-agent
BEGIN sudo dpkg -l cloudera-manager-agent | grep -E '^ii[[:space:]]*cloudera-manager-agent[[:space:]]*'
dpkg-query: no packages found matching cloudera-manager-agent
END (1)
BEGIN sudo apt-cache show cloudera-manager-agent
E: No packages found
END (100)
cloudera-manager-agent must have Version=6.2.0 and Build=968826, exiting
closing logging file descriptor

 

And this is what I have under the "/tmp/scm_prepare_node.vQZe0yDf/" folder:

[Attached screenshot: Capture.JPG]
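
Each command in scm_prepare_node.log is wrapped in a "BEGIN <command>" line and an "END (<exit status>)" line. A small sketch, assuming exactly that format (the helper name `failed_commands` is hypothetical), lists the commands that ended with a non-zero status:

```shell
# List "<exit status><TAB><command>" for every BEGIN/END pair in an
# scm_prepare_node.log whose END status is non-zero. Reads the file(s)
# given as arguments, or stdin when no argument is given.
failed_commands() {
    awk '
        /^BEGIN / { cmd = substr($0, 7); next }      # remember the command text
        /^END \([0-9]+\)$/ {
            code = $2
            gsub(/[()]/, "", code)                   # "(100)" -> "100"
            if (code + 0 != 0 && cmd != "") print code "\t" cmd
            cmd = ""
        }
    ' "$@"
}

# Usage (path taken from the post):
#   failed_commands /tmp/scm_prepare_node.vQZe0yDf/scm_prepare_node.log
```

Not every non-zero status is fatal: in the log above, "which pbrun" and "host -t PTR 10.4.34.22" both end with 1 and the script carries on. The entry that matters is the last one, "sudo apt-cache show cloudera-manager-agent" with status 100.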

Community Manager

The log shows the Cloudera repo is being installed, but the 'apt-get update' does not show the cloudera-manager repo being referenced.

 

Because the cloudera-manager repo is not indexed, the attempt to install the cloudera-manager-agent package fails.
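
One way to make that check repeatable is to scan the 'apt-get update' output for a line that references archive.cloudera.com. This is only a sketch: the helper name `update_mentions_cloudera` is hypothetical, and it assumes the "Hit:N <url> ..." line format shown in the log above.

```shell
# Succeeds (exit status 0) if the `apt-get update` output read from stdin
# contains at least one Hit/Get/Ign/Err line for archive.cloudera.com.
update_mentions_cloudera() {
    grep -Eq '^(Hit|Get|Ign|Err):[0-9]+ https?://archive\.cloudera\.com/'
}

# Usage:
#   sudo apt-get update 2>&1 | update_mentions_cloudera \
#       && echo "Cloudera repo is indexed" \
#       || echo "Cloudera repo is NOT referenced"
```

Fed the two 'apt-get update' runs from the log above, the function fails, because only the Ubuntu mirrors are listed.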

 

Could you check that the repo file "cloudera-manager.list" was correctly installed by the script? I believe it would be located under /etc/apt/sources.list.d.

 

Also check that your system can reach archive.cloudera.com (the repo is under http://archive.cloudera.com/cm6/) and is not blocked by corporate, cloud, or local firewall rules.
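
Both checks can be scripted. The sketch below covers the first one: it prints the active "deb" entries of the repo file and fails when the file is missing or contains no such entry, which would explain why 'apt-get update' never contacts the repo. The helper name `active_deb_entries` is hypothetical; the default path is the one mentioned above.

```shell
# Print the active (not commented out) "deb" / "deb-src" entries of an APT
# source list. Returns non-zero when the file cannot be read or contains no
# such entry.
active_deb_entries() {
    list_file=${1:-/etc/apt/sources.list.d/cloudera-manager.list}
    [ -r "$list_file" ] || { echo "cannot read $list_file" >&2; return 1; }
    grep -E '^[[:space:]]*deb(-src)?[[:space:]]' "$list_file"
}

# Usage:
#   active_deb_entries || echo "no usable Cloudera repo entry"
#
# Reachability can then be probed with curl, for example against the key URL
# for this release (add `-x http://<proxy-host>:<port>` if the host must go
# through an HTTP proxy):
#   curl -sSI https://archive.cloudera.com/cm6/6.2.0/ubuntu1604/apt/archive.key
```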



David Wilder, Community Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Learn more about the Cloudera Community:

Terms of Service

Community Guidelines

How to use the forum

Explorer (accepted solution)

@denloe, thank you.

First, to answer your question: yes, "cloudera-manager.list" is there in "/etc/apt/sources.list.d".

Second, I tried to install the Cloudera Manager Agent manually, but I got this error:

sudo apt-get install cloudera-manager-agent cloudera-manager-daemons
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package cloudera-manager-agent
E: Unable to locate package cloudera-manager-daemons

 

So I decided to configure the repository on the host manually, from scratch:

  1. I deleted the existing "cloudera-manager.list":
    sudo rm /etc/apt/sources.list.d/cloudera-manager.list
    Then I copied over the one from my Cloudera Manager server.

  2. Then I followed "Step 1: Configure a Repository":
    wget https://archive.cloudera.com/cm6/6.2.0/ubuntu1604/apt/archive.key
    sudo apt-key add archive.key
    sudo apt-get update

  3. After that, I followed "Manually Install Cloudera Manager Agent Packages":
    sudo apt-get install cloudera-manager-agent cloudera-manager-daemons
    Then I added/modified the server_host name in "/etc/cloudera-scm-agent/config.ini" on the new host.
    Finally, I started the agent:
    sudo systemctl start cloudera-scm-agent

And guess what! It is working now, and I could add the host to the cluster!
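
The config.ini edit in step 3 is the only part of the steps above that is not a single command. A minimal sketch of it follows, assuming the file holds a plain "server_host=<name>" line; the function name `set_server_host` and the example hostname are hypothetical.

```shell
# Rewrite the server_host line of a Cloudera Manager agent config file so the
# agent reports to the given server. Assumes a "server_host=..." line already
# exists; every other line is left untouched.
set_server_host() {
    new_host=$1
    config=${2:-/etc/cloudera-scm-agent/config.ini}
    tmp=$(mktemp) || return 1
    # The hostname is used as a sed replacement, so it must not contain "/" or "&".
    sed "s/^server_host=.*/server_host=${new_host}/" "$config" > "$tmp" \
        || { rm -f "$tmp"; return 1; }
    # Write back through cat so the file keeps its owner and permissions.
    cat "$tmp" > "$config" && rm -f "$tmp"
}

# Usage (as root, since the file is under /etc):
#   set_server_host cm-server.example.com
#   systemctl start cloudera-scm-agent
```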

 

I think there is something wrong with the 6.2 installer. The problem is not just in installing the Cloudera Manager Agent, but also in installing the Oracle JDK, which gives an error message that the Oracle JDK package does not exist!
This Oracle JDK installation issue forced me to manually install OpenJDK on the new hosts, and that caused another problem. Now I have Oracle JDK 1.8 on my Cloudera server master node, but "openjdk version 1.8.0_212" on the other nodes. Whenever I add a new host, I get a warning that the Java versions are inconsistent and that this will cause failures. So my question is: how can I switch my Cloudera server master node to "openjdk version 1.8.0_212"? Is it enough to manually install OpenJDK, so that it takes the place of the existing Oracle JDK 1.8? Or do I have to do cleanup before that, and more configuration after that?
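
To see which hosts are on which JDK, the first line of 'java -version' is usually enough. The sketch below classifies that output; it is a simplification that treats the "java version" prefix as Oracle branding, and the helper name `jdk_flavor` is hypothetical.

```shell
# Classify `java -version` output read from stdin by its version line.
# Prints "openjdk", "oracle", or "unknown".
jdk_flavor() {
    version_line=$(grep ' version "' | head -n 1)
    case $version_line in
        'openjdk version '*) echo openjdk ;;
        'java version '*)    echo oracle ;;   # assumption: Oracle-style branding
        *)                   echo unknown ;;
    esac
}

# Usage (java -version writes to stderr, hence 2>&1):
#   java -version 2>&1 | jdk_flavor
```

Running this on each node shows at a glance where the Oracle JDK and OpenJDK are mixed.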

 

Community Manager

Since you were able to access the Cloudera repo, you should be able to install the Oracle JDK on the agent with:

sudo apt-get install oracle-j2sdk1.8

If that does not work, you can install OpenJDK instead by following the instructions under "Manually Installing OpenJDK".

 

When starting, the agent looks through common paths for JDKs and will select the Oracle 1.8 JDK first if it finds it.

 

If you would prefer to use OpenJDK on all the systems, in Cloudera Manager navigate to Hosts > Hosts Configuration and set the Java Home Directory to your preferred JAVA_HOME.
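
The value to enter as Java Home Directory can be derived from the path of the java binary. A sketch, assuming a JDK 8 style layout in which java lives under <home>/bin or <home>/jre/bin (the helper name `java_home_from_binary` is hypothetical, and the path in the usage comment is only an example):

```shell
# Derive a JAVA_HOME candidate from the full path of a java binary by
# stripping the trailing /bin/java and, if present, a trailing /jre.
java_home_from_binary() {
    java_bin=$1
    home=${java_bin%/bin/java}
    home=${home%/jre}
    printf '%s\n' "$home"
}

# Usage: resolve the symlink chain of the java on PATH first, e.g.
#   java_home_from_binary "$(readlink -f "$(command -v java)")"
# With a binary at /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java this prints
#   /usr/lib/jvm/java-8-openjdk-amd64
```

Check that the resulting directory exists on every host before setting it cluster-wide.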



David Wilder, Community Manager

