Created 10-04-2017 08:32 AM
While installing CDH 5.12.1 on a 4-node cluster on AWS, I am getting the errors below. The log says something about "Failed adding torrent". How can I fix this?
Failed adding torrent: file:///opt/cloudera/parcel-cache/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel.torrent Already present torrent
exception_msg: Src file /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel does not exist,
Below is the log output of /var/log/cloudera-scm-agent/cloudera-scm-agent.log
----------------------------------- cloudera-scm-agent.log output -----------------------------
[04/Oct/2017 15:07:23 +0000] 22003 Thread-13 downloader INFO Finished download [ url: http://ip-172-31-42-243.us-east-2.compute.internal:7180/cmf/parcel/download/CDH-5.12.1-1.cdh5.12.1.p..., state: exception, total_bytes: 1741165383, downloaded_bytes: 1741165383, start_time: 2017-10-04 15:07:23, download_end_time: 2017-10-04 15:07:23, end_time: 2017-10-04 15:07:23, code: 601, exception_msg: Src file /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel does not exist, path: /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel ]
[04/Oct/2017 15:07:38 +0000] 22003 Thread-13 downloader INFO Fetching torrent: http://ip-172-31-42-243.us-east-2.compute.internal:7180/cmf/parcel/download/CDH-5.12.1-1.cdh5.12.1.p...
[04/Oct/2017 15:07:38 +0000] 22003 Thread-13 downloader INFO Starting download of: http://ip-172-31-42-243.us-east-2.compute.internal:7180/cmf/parcel/download/CDH-5.12.1-1.cdh5.12.1.p...
[04/Oct/2017 15:07:38 +0000] 22003 Thread-13 downloader INFO Failed adding torrent: file:///opt/cloudera/parcel-cache/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel.torrent Already present torrent: CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel
[04/Oct/2017 15:07:38 +0000] 22003 Thread-13 downloader INFO Current state: CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel [totalDownloaded=1741165383 totalSize=1741165383 upload=0 state=seeding seed=['http://ip-172-31-42-243.us-east-2.compute.internal:7180/cmf/parcel/download/CDH-5.12.1-1.cdh5.12.1.p...'] location=/opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel progress=1000000]
[04/Oct/2017 15:07:38 +0000] 22003 Thread-13 downloader INFO Completed download of http://ip-172-31-42-243.us-east-2.compute.internal:7180/cmf/parcel/download/CDH-5.12.1-1.cdh5.12.1.p... code=200 state=downloaded
[04/Oct/2017 15:07:38 +0000] 22003 Thread-13 parcel_cache WARNING No checksum in header, skipping verification
[04/Oct/2017 15:07:38 +0000] 22003 Thread-13 parcel_cache INFO Unpacking /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel into /opt/cloudera/parcels
[04/Oct/2017 15:07:38 +0000] 22003 Thread-13 downloader ERROR Failed op: Src file /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel does not exist
Traceback (most recent call last):
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/downloader.py", line 501, in callable
callback(url, curr_op)
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/parcel_cache.py", line 203, in cb
raise e
Exception: Src file /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel does not exist
[04/Oct/2017 15:07:38 +0000] 22003 Thread-13 downloader INFO Finished download [ url: http://ip-172-31-42-243.us-east-2.compute.internal:7180/cmf/parcel/download/CDH-5.12.1-1.cdh5.12.1.p..., state: exception, total_bytes: 1741165383, downloaded_bytes: 1741165383, start_time: 2017-10-04 15:07:38, download_end_time: 2017-10-04 15:07:38, end_time: 2017-10-04 15:07:38, code: 601, exception_msg: Src file /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel does not exist, path: /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel ]
Created 10-04-2017 02:43 PM
Although logging out of CM and restarting CM didn't solve the issue, some wise soul had found a solution earlier: http://community.cloudera.com/t5/Cloudera-Manager-Installation/New-Install-Hosts-show-quot-Currently...
So I went ahead and deleted Cluster 1 from the Cloudera Manager home page. After that, logging back into CM showed the 4 hosts, and I could continue with the install process, which distributed and activated the parcel on all 4 hosts. I am now continuing with the next steps of configuring services and databases. Thanks for all your help in resolving this tricky issue! Lesson learnt: the minimum disk space needed for a successful install was 16GB on the Cloudera Manager installer node and 12GB on the other nodes for a poor man's small cluster, and one probably should not go with less than 30GB on each server to be safe 🙂
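For anyone who wants to check disk sizing before starting the wizard, here is a minimal Python sketch. It only reports the size and usage of the filesystem holding a path. The 30 GB threshold is the "to be safe" figure from this post, not an official Cloudera minimum, and the helper name disk_report is made up for illustration.

```python
import shutil

GB = 10 ** 9  # decimal gigabytes; use 2 ** 30 instead if you size volumes in GiB


def disk_report(path="/"):
    """Return (total_gb, free_gb, percent_used) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.total / GB, usage.free / GB, 100.0 * usage.used / usage.total


total_gb, free_gb, pct = disk_report("/")
print("/: %.1f GB total, %.1f GB free, %.0f%% used" % (total_gb, free_gb, pct))
if total_gb < 30:  # assumption: the poster's suggested size, not a documented limit
    print("root volume is smaller than the 30 GB suggested above")
```

Run it on each host (the CM node and the other nodes) before distributing parcels. A root filesystem close to 100% used is the condition that caused the failure in this thread.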
Created 10-04-2017 09:12 AM
Based on the agent log, it appears that the agent reported the parcel file as fully downloaded:
Finished download [ url: http://ip-172-31-42-243.us-east-2.compute.internal:7180/cmf/parcel/download/CDH-5.12.1-1.cdh5.12.1.p..., state: exception, total_bytes: 1741165383, downloaded_bytes: 1741165383, start_time: 2017-10-04 15:07:23, download_end_time: 2017-10-04 15:07:23, end_time: 2017-10-04 15:07:23, code
However, even on the fastest of machines, I would expect a download of 1741165383 bytes (about 1.7 GB) to take longer than 1 second, yet start_time and download_end_time are identical. So I suspect something didn't go right.
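To make that check repeatable, here is a rough Python sketch that pulls the byte count and timestamps out of a "Finished download" line and flags a large download reported in under a second. The field names come from the log line quoted above. The function name download_summary is hypothetical, and the parsing is only tested against this one line format.

```python
import re
from datetime import datetime

# Field names are taken from the agent log line quoted above.
_FIELD = re.compile(r"(total_bytes|start_time|download_end_time): ([^,\]]+)")
_TS = "%Y-%m-%d %H:%M:%S"


def download_summary(log_line):
    """Return (total_bytes, elapsed_seconds) parsed from one agent log line."""
    fields = dict(_FIELD.findall(log_line))
    start = datetime.strptime(fields["start_time"].strip(), _TS)
    end = datetime.strptime(fields["download_end_time"].strip(), _TS)
    return int(fields["total_bytes"]), (end - start).total_seconds()


# Sample line copied from the log excerpt in this thread (URL is truncated there too).
line = ("Finished download [ url: http://ip-172-31-42-243.us-east-2.compute.internal:7180"
        "/cmf/parcel/download/CDH-5.12.1-1.cdh5.12.1.p..., state: exception, "
        "total_bytes: 1741165383, downloaded_bytes: 1741165383, "
        "start_time: 2017-10-04 15:07:23, download_end_time: 2017-10-04 15:07:23, "
        "end_time: 2017-10-04 15:07:23, code")

size, elapsed = download_summary(line)
if size > 0 and elapsed < 1:
    print("suspicious: %d bytes reported in %.0f s" % (size, elapsed))
```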
Furthermore, when the agent attempts to unpack the parcel, the parcel file does not exist:
Src file /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel does not exist, path: /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel
Recommendation:
Verify that the file exists. As a root / sudo user:
ls -l /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel
If it does not exist, that indicates a problem writing to the .flood directory.
I suspect permissions may be at play. Check to make sure that the /opt/cloudera/parcels directory is owned by the Cloudera Manager user (typically cloudera-scm).
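If you prefer to script the existence and ownership checks, something like the following Python sketch should work on a Linux host. The helper name describe_path is made up for illustration. It only reports what it finds and does not change any permissions.

```python
import os
import pwd


def describe_path(path):
    """Return (exists, owner_name) for path; owner_name is None if it is missing."""
    try:
        st = os.stat(path)
    except OSError:  # missing path, or no permission to stat it
        return False, None
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:  # the uid has no passwd entry; fall back to the number
        owner = str(st.st_uid)
    return True, owner


# Directories named in this thread; run as root or with sudo.
for p in ("/opt/cloudera/parcels", "/opt/cloudera/parcels/.flood"):
    exists, owner = describe_path(p)
    print(p, "exists" if exists else "MISSING", "owner=%s" % owner)
```

If the owner printed is not the user your agent runs as (cloudera-scm in a typical install), that is worth investigating before retrying the download.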
Let us know what you find.
Created on 10-04-2017 11:17 AM - edited 10-04-2017 11:26 AM
Thanks. Based on your suggestions, I looked and did not find the /opt/cloudera/parcels/.flood/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel/CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel file.
Luckily, I also noticed that during the installation the "/" filesystem was 100% full, so I went ahead and ran
$ sudo service cloudera-scm-server stop
then stopped all the AWS instances, added 2GB to each instance, and started the instances again.
After that I ran
$ sudo service cloudera-scm-server start
then logged into cmhost:7180 and continued the install.
This time 3 of the 4 hosts activated successfully. I added another 2GB to the unsuccessful host and am trying to rerun the install, but it does not let me proceed past the "Specify hosts for your CDH cluster" host search screen. It shows all four hosts with Currently managed = YES, but the checkboxes and the Continue button are all greyed out, so I cannot move ahead from this screen.
Also, the Parcels status page shows the parcel activated on Cluster 1:
CDH 5 | 5.12.1-1.cdh5.12.1.p0.3 | Distributed, Activated |
How do I fix this and continue with the install? Thanks!
Created 10-04-2017 11:28 AM
Good news finding the space issue!
It sounds as if your browser session has gotten into a bad state.
At this stage, I'd recommend starting over with the new cluster wizard. You can log out of Cloudera Manager and come back in. That should present you with the first page of the Add New Cluster Wizard.
If not, you can log out, restart Cloudera Manager, and then try again.
If it still fails on the one host, check the agent log on that host and paste information that is relevant.
Ben
Created on 10-04-2017 08:53 PM - edited 10-04-2017 09:30 PM
Now I am getting a new error and the install is stuck at the step below. Any help on this?
Cluster Setup
There was an error when communicating with the server. See the log file for more information.
0/2 steps completed.
It looks like a URLError: <urlopen error [Errno 111] Connection refused>
When I log off and log back into CM, none of the services are started.
------------------- CM agent log file output---------------------------------------
[05/Oct/2017 03:06:45 +0000] 1482 MainThread agent INFO Triggering supervisord update.
[05/Oct/2017 03:06:46 +0000] 1482 MainThread process INFO Begin audit plugin refresh
[05/Oct/2017 03:06:46 +0000] 1482 MainThread navigator_plugin INFO Scheduling a refresh for Audit Plugin for zookeeper-server with pipelines []
[05/Oct/2017 03:06:46 +0000] 1482 MainThread process INFO Begin metadata plugin refresh
[05/Oct/2017 03:06:46 +0000] 1482 MainThread navigator_plugin INFO Scheduling a refresh for Metadata Plugin for zookeeper-server with pipelines []
[05/Oct/2017 03:06:46 +0000] 1482 MainThread __init__ INFO Instantiating generic monitor for service ZOOKEEPER and role SERVER
[05/Oct/2017 03:06:46 +0000] 1482 MainThread process INFO Begin monitor refresh.
[05/Oct/2017 03:06:46 +0000] 1482 MainThread abstract_monitor INFO Refreshing GenericMonitor ZOOKEEPER-SERVER for None
[05/Oct/2017 03:06:46 +0000] 1482 MainThread __init__ INFO New monitor: (<cmf.monitor.generic.GenericMonitor object at 0x7faa5908e490>,)
[05/Oct/2017 03:06:46 +0000] 1482 MainThread process INFO Daemon refresh complete for process 70-zookeeper-server.
[05/Oct/2017 03:06:46 +0000] 1482 MainThread navigator_plugin INFO Scheduling a refresh for Metadata Plugin for spark with pipelines []
[05/Oct/2017 03:06:47 +0000] 1482 Audit-Plugin navigator_plugin INFO stopping Audit Plugin for zookeeper-init with pipelines []
[05/Oct/2017 03:06:47 +0000] 1482 Audit-Plugin navigator_plugin_pipeline INFO Stopping Navigator Plugin Pipeline '' for zookeeper-init (log dir: None)
[05/Oct/2017 03:06:47 +0000] 1482 Audit-Plugin navigator_plugin INFO Refreshing Audit Plugin for zookeeper-server with pipelines []
[05/Oct/2017 03:06:47 +0000] 1482 Audit-Plugin navigator_plugin_pipeline INFO Stopping Navigator Plugin Pipeline '' for zookeeper-server (log dir: None)
[05/Oct/2017 03:06:48 +0000] 1482 Metadata-Plugin navigator_plugin INFO stopping Metadata Plugin for zookeeper-init with pipelines []
[05/Oct/2017 03:06:48 +0000] 1482 Metadata-Plugin navigator_plugin_pipeline INFO Stopping Navigator Plugin Pipeline '' for zookeeper-init (log dir: None)
[05/Oct/2017 03:06:48 +0000] 1482 Metadata-Plugin navigator_plugin INFO Refreshing Metadata Plugin for spark with pipelines []
[05/Oct/2017 03:06:48 +0000] 1482 Metadata-Plugin __init__ INFO Read metadata config for role GATEWAY. Metadata server url: http://ip-172-31-42-243.us-east-2.compute.internal:7187
[05/Oct/2017 03:06:48 +0000] 1482 Metadata-Plugin navigator_thread INFO Log entry poll interval: 5
[05/Oct/2017 03:06:48 +0000] 1482 Metadata-Plugin navigator_thread INFO Navigator server url: http://ip-172-31-42-243.us-east-2.compute.internal:7187
[05/Oct/2017 03:06:48 +0000] 1482 Metadata-Plugin inotify_event_processor INFO Setting watch for directories:
[05/Oct/2017 03:06:48 +0000] 1482 Metadata-Plugin navigator_plugin INFO Refreshing Metadata Plugin for zookeeper-server with pipelines []
[05/Oct/2017 03:06:48 +0000] 1482 Metadata-Plugin navigator_plugin_pipeline INFO Stopping Navigator Plugin Pipeline '' for zookeeper-server (log dir: None)
[05/Oct/2017 03:06:58 +0000] 1482 Audit-Plugin navigator_thread INFO Done processing navigator log None, switching to next log mgmt-NAVIGATORMETASERVER-fc45f156058af38149758a8f42863c3d-1507172816397.
[05/Oct/2017 03:07:30 +0000] 1482 MonitorDaemon-Scheduler __init__ INFO Monitor expired: ('GenericMonitor MGMT-NAVIGATORMETASERVER for None',)
[05/Oct/2017 03:07:30 +0000] 1482 MonitorDaemon-Scheduler __init__ INFO Monitor expired: ('GenericMonitor MGMT-ACTIVITYMONITOR for mgmt-ACTIVITYMONITOR-fc45f156058af38149758a8f42863c3d',)
[05/Oct/2017 03:07:30 +0000] 1482 MonitorDaemon-Scheduler __init__ INFO Monitor ready to report: ('GenericMonitor MGMT-NAVIGATOR for mgmt-NAVIGATOR-fc45f156058af38149758a8f42863c3d',)
[05/Oct/2017 03:07:30 +0000] 1482 MonitorDaemon-Scheduler __init__ INFO Monitor expired: ('GenericMonitor MGMT-SERVICEMONITOR for None',)
[05/Oct/2017 03:07:34 +0000] 1482 MonitorDaemon-Scheduler __init__ INFO Monitor expired: ('GenericMonitor MGMT-REPORTSMANAGER for mgmt-REPORTSMANAGER-fc45f156058af38149758a8f42863c3d',)
[05/Oct/2017 03:07:34 +0000] 1482 MonitorDaemon-Scheduler __init__ INFO Monitor expired: ('GenericMonitor MGMT-EVENTSERVER for mgmt-EVENTSERVER-fc45f156058af38149758a8f42863c3d',)
[05/Oct/2017 03:07:34 +0000] 1482 MonitorDaemon-Scheduler __init__ INFO Monitor expired: ('GenericMonitor MGMT-ALERTPUBLISHER for mgmt-ALERTPUBLISHER-fc45f156058af38149758a8f42863c3d',)
[05/Oct/2017 03:07:34 +0000] 1482 MonitorDaemon-Scheduler __init__ INFO Monitor expired: ('GenericMonitor MGMT-HOSTMONITOR for None',)
[05/Oct/2017 03:07:35 +0000] 1482 Monitor-GenericMonitor throttling_logger ERROR Error fetching metrics at 'http://ip-172-31-42-243.us-east-2.compute.internal:8087/jmx'
Traceback (most recent call last):
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/monitor/generic/metric_collectors.py", line 200, in _collect_and_parse_and_return
self._adapter.safety_valve))
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/url_util.py", line 207, in urlopen_with_retry_on_authentication_errors
return function()
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/monitor/generic/metric_collectors.py", line 217, in _open_url
password=self._password_value)
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/url_util.py", line 70, in urlopen_with_timeout
return opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 429, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 447, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1228, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1198, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 111] Connection refused>
[05/Oct/2017 03:07:36 +0000] 1482 Monitor-GenericMonitor throttling_logger ERROR Error fetching metrics at 'http://ip-172-31-42-243.us-east-2.compute.internal:8086/jmx'
Traceback (most recent call last):
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/monitor/generic/metric_collectors.py", line 200, in _collect_and_parse_and_return
self._adapter.safety_valve))
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/url_util.py", line 207, in urlopen_with_retry_on_authentication_errors
return function()
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/monitor/generic/metric_collectors.py", line 217, in _open_url
password=self._password_value)
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/url_util.py", line 70, in urlopen_with_timeout
return opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 429, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 447, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1228, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1198, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 111] Connection refused>
[05/Oct/2017 03:07:36 +0000] 1482 Monitor-GenericMonitor throttling_logger ERROR Error fetching metrics at 'http://ip-172-31-42-243.us-east-2.compute.internal:7187/debug/jmx'
Traceback (most recent call last):
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/monitor/generic/metric_collectors.py", line 200, in _collect_and_parse_and_return
self._adapter.safety_valve))
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/url_util.py", line 207, in urlopen_with_retry_on_authentication_errors
return function()
File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.1-py2.7.egg/cmf/monitor/generic/metric_collectors.py", line 217, in _open_url
------------------------ CM server log output says some database error--------------------------------
2017-10-05 03:08:22,359 WARN JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 12477ms: no GCs detected.
2017-10-05 03:08:26,017 INFO JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 2584ms: no GCs detected.
2017-10-05 03:08:26,065 WARN Timer-0:com.mchange.v2.async.ThreadPoolAsynchronousRunner: com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@a291156 -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
2017-10-05 03:08:26,130 INFO ScheduledStalenessChecker:com.cloudera.cmf.command.flow.CmdStep: Executing command work: SeqCmdWork of 1 steps in sequence
2017-10-05 03:08:26,151 INFO ScheduledStalenessChecker:com.cloudera.cmf.command.flow.CmdStep: Executing command work: Configuration Staleness Check
2017-10-05 03:08:31,345 INFO JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 2321ms: no GCs detected.
2017-10-05 03:08:35,273 INFO JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 1612ms: no GCs detected.
2017-10-05 03:08:40,900 INFO JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 2214ms: no GCs detected.
2017-10-05 03:08:42,123 WARN avro-servlet-hb-processor-0:org.hibernate.engine.jdbc.spi.SqlExceptionHelper: SQL Error: 0, SQLState: null
2017-10-05 03:08:42,135 ERROR avro-servlet-hb-processor-0:org.hibernate.engine.jdbc.spi.SqlExceptionHelper: An attempt by a client to checkout a Connection has timed out.
2017-10-05 03:08:42,206 WARN Timer-0:com.mchange.v2.async.ThreadPoolAsynchronousRunner: com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@a291156 -- APPARENT DEADLOCK!!! Complete Status:
Managed Threads: 3
Active Threads: 3
Active Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@31e01998 (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0)
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@6fb063bc (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2)
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@256404ed (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1)
Pending Tasks:
com.mchange.v2.c3p0.stmt.GooGooStatementCache$1StmtAcquireTask@497f76c8
com.mchange.v2.c3p0.stmt.GooGooStatementCache$1StmtAcquireTask@4f4f6a2a
Pool thread stack traces:
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:560)
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:560)
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:560)
2017-10-05 03:08:42,975 WARN avro-servlet-hb-processor-0:com.cloudera.server.cmf.AgentProtocolImpl: Internal failure of Heartbeat processing for host id: 3d0d65fa-1abb-4671-bb95-630823a550c4
javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Could not open connection
at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1387)
at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1310)
at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1397)
at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:62)
at com.cloudera.enterprise.AbstractWrappedEntityManager.begin(AbstractWrappedEntityManager.java:69)
at com.cloudera.cmf.persist.CmfEntityManager.begin(CmfEntityManager.java:299)
at com.cloudera.server.cmf.AgentProtocolImpl.heartbeat(AgentProtocolImpl.java:220)
at sun.reflect.GeneratedMethodAccessor991.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.avro.ipc.specific.SpecificResponder.respond(SpecificResponder.java:88)
at org.apache.avro.ipc.Responder.respond(Responder.java:149)
at org.apache.avro.ipc.Responder.respond(Responder.java:99)
at com.cloudera.server.common.HttpConnectorServer$FunctionsImpl.invoke(HttpConnectorServer.java:105)
at com.cloudera.server.common.HttpConnectorServer$FunctionsImpl.invoke(HttpConnectorServer.java:88)
at com.cloudera.server.common.AgentAvroServlet$ProcessHBRequestTask$1.get(AgentAvroServlet.java:160)
at com.cloudera.server.common.MovingStats.measure(MovingStats.java:41)
at com.cloudera.server.common.AgentAvroServlet$ProcessHBRequestTask.call(AgentAvroServlet.java:153)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
Created 01-08-2018 09:49 PM
Were you able to resolve the issue? Since the error says "Connection refused", you can check the following points:
1. The /etc/hosts file has the correct entry, with the IP address and the hostname including the FQDN.
2. The firewall service is off.
3. The network is up and running.
4. SELinux is disabled.
5. Connectivity between the VMs, and from the VMs to the internet, is working.
Let me know if this doesn't work.
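One quick way to narrow down a "Connection refused" is to test the TCP ports directly from the agent host. Below is a small Python sketch. The function names port_open and check_ports are made up for illustration, and it uses only the standard socket module.

```python
import socket


def port_open(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, unreachable, timed out, and DNS failures
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to True (accepting connections) or False."""
    return {port: port_open(host, port, timeout) for port in ports}
```

For the log in this thread, you could call check_ports("ip-172-31-42-243.us-east-2.compute.internal", [7187, 8086, 8087]) from one of the agent hosts. A False for a port usually means either that nothing is listening there (the role is not started) or that a firewall is in the way, so it helps to also check whether the management roles are actually running.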
Created 01-08-2018 09:52 PM
Thanks, I will check these items and update.
Created 05-09-2020 09:58 AM
I did the following steps and it helped
1)
[root@cm_scm_103 cloudera-scm-agent]# cp /opt/cloudera/parcel-repo/STREAMSETS_DATACOLLECTOR-3.14.0-el7.parcel.torrent /opt/cloudera/parcel-cache/
[root@cm_scm_103 cloudera-scm-agent]# ls -ltr /opt/cloudera/parcel-cache/STREAMSETS_DATACOLLECTOR-3.14.0-el7.parcel.torrent
-rw-r----- 1 root root 211751 May 9 11:14 /opt/cloudera/parcel-cache/STREAMSETS_DATACOLLECTOR-3.14.0-el7.parcel.torrent
[root@cm_scm_103 cloudera-scm-agent]#
2)
Made sure there was enough free space in /opt/cloudera/parcels
3) Restarted cloudera-scm-agent on server_b01, server_b02 and cm_scm_103
[root@server_b01 parcel-cache]# systemctl restart cloudera-scm-agent
4) Restarted cloudera-scm-server on cm_scm_103
[root@cm_scm_103 cloudera-scm-agent]# sudo service cloudera-scm-server restart
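Before copying anything as in step 1, it can help to see which .torrent files are in parcel-repo but not in parcel-cache. This is a minimal Python sketch for that. The helper name missing_torrents is hypothetical. It only lists names and does not copy files. Whether copying the torrent is the right fix for a given cluster is an assumption based on the steps above.

```python
import os


def missing_torrents(repo_dir, cache_dir):
    """Names of *.torrent files present in repo_dir but absent from cache_dir."""
    def torrents(directory):
        try:
            return {n for n in os.listdir(directory) if n.endswith(".torrent")}
        except OSError:  # directory missing or unreadable: treat it as empty
            return set()
    return sorted(torrents(repo_dir) - torrents(cache_dir))


# Paths taken from step 1 above.
print(missing_torrents("/opt/cloudera/parcel-repo", "/opt/cloudera/parcel-cache"))
```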