CM cannot copy installation files to other servers

Contributor

Hello everyone.

 

I'm going through "Installation path B" for a production cluster.  MySQL installed and configured fine, Cloudera Manager starts up fine, and I'm going through the familiar steps of choosing hosts in the cluster on which to have CM install the agent software.

 

I get to the "Install Step" in CM, and one of the very first things it does is "Copying installation files" to the other hosts.  This process never completes, and the file on the target system just sits at 0 bytes until the process times out in CM.

 

My /etc/hosts file seems correct to me:

 

CM Server host:

root@bdprodm09:[57]:~> python -c "import socket; print socket.getfqdn(); print socket.gethostbyname(socket.getfqdn())"
bdprodm09.dbhotelcloud.com
172.0.30.1

 

Agent host:

root@bdprodm10:[37]:~> python -c "import socket; print socket.getfqdn(); print socket.gethostbyname(socket.getfqdn())"
bdprodm10.dbhotelcloud.com
172.0.30.2

 

I traced the packets on port 22 between those two IPs with tcpdump, and I see the traffic and no errors.  But in the target /tmp/XXX directory, I just get this:

 

root@bdprodm10:[52]:/tmp/scm_prepare_node.f0oOug56> ls -l
total 0
-rw-r--r-- 1 root root 0 May 2 21:47 scm_prepare_node.sh

 

This particular time I tried it as root, but I've had the same thing happen with another user account (cminstall) that is set up with passwordless su - root as well:

 

root@bdprodm10:[57]:/tmp> ls -l scm_prepare_node.uF3EVkb4
total 0
-rw-r--r-- 1 cminstall cminstall 0 May 2 21:41 scm_prepare_node.sh

 

I tried becoming the 'cminstall' user and manually scp'ing a file from one host to the other, and it worked fine:

 

[cminstall@bdprodm09 ~]$ scp testfile cminstall@bdprodm10.dbhotelcloud.com:/tmp
cminstall@bdprodm10.dbhotelcloud.com's password:
testfile 100% 33 0.0KB/s 00:00

 

---

root@bdprodm10:[58]:/tmp> ls -l testfile
-rw-rw-r-- 1 cminstall cminstall 33 May 2 22:07 testfile

 

The cloudera-scm-server log file doesn't say anything that indicates what is failing:

2016-05-02 22:01:29,243 INFO 164956425@scm-web-36:com.cloudera.server.cmf.node.NodeConfiguratorService: Retrying configurator with id 0
2016-05-02 22:01:29,244 INFO 164956425@scm-web-36:com.cloudera.server.cmf.node.NodeConfiguratorService: Submitted configurator for bdprodm10.dbhotelcloud.com with id 1
2016-05-02 22:01:29,244 INFO NodeConfiguratorThread-6-1:com.cloudera.server.cmf.node.NodeConfiguratorProgress: bdprodm10.dbhotelcloud.com: Transitioning from INIT (PT0.001S) to CONNECT
2016-05-02 22:01:29,245 INFO NodeConfiguratorThread-6-1:net.schmizz.sshj.transport.TransportImpl: Client identity string: SSH-2.0-SSHJ_0_14_0
2016-05-02 22:01:29,253 INFO NodeConfiguratorThread-6-1:net.schmizz.sshj.transport.TransportImpl: Server identity string: SSH-2.0-OpenSSH_5.3
2016-05-02 22:01:29,367 INFO NodeConfiguratorThread-6-1:com.cloudera.server.cmf.node.NodeConfiguratorProgress: bdprodm10.dbhotelcloud.com: Transitioning from CONNECT (PT0.123S) to AUTHENTICATE
2016-05-02 22:01:29,458 INFO NodeConfiguratorThread-6-1:com.cloudera.server.cmf.node.NodeConfiguratorProgress: bdprodm10.dbhotelcloud.com: Transitioning from AUTHENTICATE (PT0.091S) to MAKE_TEMP_DIR
2016-05-02 22:01:29,460 INFO NodeConfiguratorThread-6-1:com.cloudera.server.cmf.node.NodeConfigurator: Executing mktemp -d /tmp/scm_prepare_node.XXXXXXXX on bdprodm10.dbhotelcloud.com
2016-05-02 22:01:29,520 INFO NodeConfiguratorThread-6-1:com.cloudera.server.cmf.node.NodeConfiguratorProgress: bdprodm10.dbhotelcloud.com: Transitioning from MAKE_TEMP_DIR (PT0.062S) to COPY_FILES

This is the first time I've had this problem!  I've done this install many times in the past, and I'm out of things to check 😞 

 

Hoping someone can shed some light on this one for me!  

 

Thanks very much for your time.

Chris

1 ACCEPTED SOLUTION

Master Collaborator

Based on issues I've seen, this most often comes down to the jumbo frame configuration 😞 Could you verify that jumbo frames are working? [1]

An alternative installation path is to install the packages manually on the node, then edit /etc/cloudera-scm-agent/config.ini (e.g. with vi) and set the hostname of the CM server: server_host=....

 

e.g.:

yum clean all
rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera

# the repo file below is for the latest CM5

wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/cloudera-manager.repo -O /etc/yum.repos.d/cloudera-manager.repo

# for a specific release, modify the baseurl in /etc/yum.repos.d/cloudera-manager.repo to point to the desired CM version

# example for version 5.5.1: baseurl=http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.5.1/

yum install -y oracle-j2sdk* cloudera-manager-{daemons,agent}
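
For the config.ini step, a minimal Python sketch of the edit might look like the following. It only rewrites the value of the server_host line in the file's text, so comments and other settings are left alone. The function name set_server_host and the sample file contents are made up for illustration; only the server_host key and the file path come from the post above.

```python
import re


def set_server_host(config_text: str, server_host: str) -> str:
    """Return config_text with the first server_host entry set to server_host.

    Only the value on the server_host line changes; comments and all
    other settings are preserved exactly as they were.
    """
    # [ \t]* (not \s*) so the match can never spill onto the next line.
    pattern = re.compile(r"^([ \t]*server_host[ \t]*=[ \t]*).*$", re.MULTILINE)
    if not pattern.search(config_text):
        raise ValueError("no server_host entry found in config text")
    # A function replacement avoids backslash-escape surprises in the value.
    return pattern.sub(lambda m: m.group(1) + server_host, config_text, count=1)


# Hypothetical sample contents, for illustration only.
sample = (
    "[General]\n"
    "# Hostname of the CM server.\n"
    "server_host=localhost\n"
    "server_port=7182\n"
)
updated = set_server_host(sample, "bdprodm09.dbhotelcloud.com")
print(updated)
```

To apply it for real, read /etc/cloudera-scm-agent/config.ini, pass the text through the function, and write the result back (as root); the agent would then need a restart to pick up the change.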

 

[1] https://www.mylesgray.com/hardware/test-jumbo-frames-working/
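
As a rough sketch of the jumbo frame check: a ping with the don't-fragment flag set only gets through if the whole packet fits in the path MTU, so the ICMP payload has to be the MTU minus the IPv4 header (20 bytes without options) and the ICMP echo header (8 bytes). The Python below just does that arithmetic and builds the matching Linux (iputils) ping command. The function names are invented for illustration, and the host and MTU in the example are the ones from this thread.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an ICMP echo packet")
    return payload


def ping_command(host: str, mtu: int, count: int = 3) -> list:
    """Build a Linux ping command that forbids fragmentation (-M do)."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for_mtu(mtu)), host]


print(" ".join(ping_command("172.0.30.2", 9000)))
```

If that ping gets no replies while a default-size ping works, something on the path (NIC, bond, switch port) is probably not passing frames of that size.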


14 REPLIES

Contributor

Thanks again 🙂

 

It's a bonded 4x10Gb interface with an MTU of 9000.

 

bond0     Link encap:Ethernet  HWaddr A0:36:9F:96:76:F4  
          inet addr:172.0.30.2  Bcast:172.0.30.255  Mask:255.255.255.0
          inet6 addr: fe80::a236:9fff:fe96:76f4/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:9000  Metric:1
          RX packets:8750 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1500 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:616956 (602.4 KiB)  TX bytes:259076 (253.0 KiB)
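
The MTU in the ifconfig output above can also be read straight from sysfs on Linux, which makes it easy to compare every interface on a host (the bond and its slaves, say). A minimal sketch, assuming the usual /sys/class/net/<interface>/mtu layout; the function names are made up for illustration:

```python
from pathlib import Path


def read_mtu(interface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read the configured MTU of one interface from sysfs."""
    return int((Path(sysfs_root) / interface / "mtu").read_text().strip())


def mtu_by_interface(sysfs_root: str = "/sys/class/net") -> dict:
    """Map each interface name under sysfs_root to its MTU."""
    root = Path(sysfs_root)
    return {
        entry.name: read_mtu(entry.name, sysfs_root)
        for entry in sorted(root.iterdir())
        # Skip entries such as bonding_masters that have no mtu file.
        if (entry / "mtu").is_file()
    }


if Path("/sys/class/net").is_dir():
    for name, mtu in mtu_by_interface().items():
        print(name, mtu)
```

This only shows what the host is configured for. It says nothing about whether the switches in between pass frames of that size.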

Chris

Contributor

Ahhh....I think you are on to something for sure!  </happy_dance>

 

chris.neal@bdprodm09:[65]:~> ping -M do -s 8972 172.0.30.2
PING 172.0.30.2 (172.0.30.2) 8972(9000) bytes of data.
^C
--- 172.0.30.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1574ms
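
When the full-size ping fails like this, it can help to find the largest payload that does get through, since that hints at which MTU the path really supports. Below is a small Python sketch of a binary search over payload sizes. It assumes that if a given size passes, every smaller size passes too. The helper names are invented for illustration, ping_fits relies on the Linux iputils flags (-M do, -c, -W, -s), and the demo at the bottom uses a simulated path rather than real pings.

```python
import subprocess
from typing import Callable


def ping_fits(host: str, payload: int) -> bool:
    """Send one don't-fragment ping; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "1", "-W", "1", "-s", str(payload), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_working_payload(probe: Callable[[int], bool],
                            low: int = 0, high: int = 8972) -> int:
    """Binary-search the largest payload in [low, high] that probe accepts.

    Returns -1 if even the smallest payload fails.
    """
    if not probe(low):
        return -1
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Simulated path that only passes standard 1500-byte frames
# (1500 - 28 bytes of headers = 1472 bytes of payload).
print(largest_working_payload(lambda size: size <= 1472))
```

Against a real host the probe would be something like lambda size: ping_fits("172.0.30.2", size). A result of 1472 would suggest some hop is still limited to a 1500-byte MTU.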

I'm going to ping our network guy and have him validate the switch configurations.  

 

Thank you thank you thank you. 🙂

I'll report back soon, I hope!

Chris

Contributor

Well, it seems the network guy has gone home for the day, so I set all the MTUs to 1500 between the CM server host and one agent server, and everything is working great. 🙂 🙂

 

Michalis, thank you again for taking the time to read my extra long posts and get me past this issue.

 

I'll have the network guy check the switches and firewall configuration and find out where the problem is ASAP tomorrow.  I'll post back once more with what the final issue turns out to be.

 

Thanks again.

Chris

 

Contributor

Just one last reply to confirm that an improper switch configuration was the issue here.  Once that was fixed, everything worked great!

 

Thanks again for all the help.

Chris