Created on
10-09-2019
06:59 PM
- last edited on
10-10-2019
04:37 AM
by
VidyaSargur
I set up 4 machines (Ubuntu 16.04, CDH 5.14.4 ) as a cluster, but now, after I start them, the master node can't be recognized by Cloudera. Each machine has 8G memory.
I also have some other problems, like cloudera manager service can't start. The more detail is posted in this post:https://community.cloudera.com/t5/Support-Questions/cannot-start-Cloudera-manager-service/m-p/279713...
Created 10-11-2019 02:27 AM
I expand the server's memory to 16G, and each agent still has 8G.
I did the following steps to start up the cluster:
(1)stop all the agents by using the command:
/opt/cm-5.14.4/etc/init.d/cloudera-scm-agent stop
(2) stop the server :/opt/cm-5.14.4/etc/init.d/cloudera-scm-server stop
(3)start server: /opt/cm-5.14.4/etc/init.d/cloudera-scm-server start
(4)start agents:/opt/cm-5.14.4/etc/init.d/cloudera-scm-agent start
But the problem remains the same.
The cloudera-scm-server.log says:
INFO CMMetricsForwarder-0:com.cloudera.server.cmf.components.ClouderaManagerMetricsForwarder: Failed to send metrics.
java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy113.writeMetrics(Unknown Source)
at com.cloudera.server.cmf.components.ClouderaManagerMetricsForwarder.sendWithAvro(ClouderaManagerMetricsForwarder.java:325)
at com.cloudera.server.cmf.components.ClouderaManagerMetricsForwarder.sendMetrics(ClouderaManagerMetricsForwarder.java:312)
at com.cloudera.server.cmf.components.ClouderaManagerMetricsForwarder.run(ClouderaManagerMetricsForwarder.java:146)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.avro.AvroRemoteException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:88)
... 11 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1226)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1340)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1315)
at org.apache.avro.ipc.HttpTransceiver.writeBuffers(HttpTransceiver.java:71)
at org.apache.avro.ipc.Transceiver.transceive(Transceiver.java:58)
at org.apache.avro.ipc.Transceiver.transceive(Transceiver.java:72)
at org.apache.avro.ipc.Requestor.request(Requestor.java:147)
at org.apache.avro.ipc.Requestor.request(Requestor.java:101)
at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:72)
... 11 more
Please help me to solve this puzzling problem.
Created 10-11-2019 03:05 AM
Let me give more information:the CDH version is 5.14.4, and the OS system is Ubuntu 16.04.6.
The cloudera-scm-agent.log says:
MonitorDaemon-Reporter throttling_logger ERROR (9 skipped) Error sending messages to firehose: mgmt-HOSTMONITOR-e3dc96ab2a3f384d8dace815d2dddeaf
Traceback (most recent call last):
File "/opt/cm-5.14.4/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.14.4-py2.7.egg/cmf/monitor/firehose.py", line 120, in _send
self._port)
File "/opt/cm-5.14.4/lib/cmf/agent/build/env/lib/python2.7/site-packages/avro-1.6.3-py2.7.egg/avro/ipc.py", line 469, in __init__
self.conn.connect()
File "/usr/lib/python2.7/httplib.py", line 846, in connect
self.timeout, self.source_address)
File "/usr/lib/python2.7/socket.py", line 575, in create_connection
raise err