
Regarding Cloudera Management Services restart


Explorer

Hi All,

 

I have been running a Kerberized Cloudera (CDH 5.2.1) cluster for a long time. Two days ago I stopped all the servers, and the next day, when I tried to restart the Cloudera Management Services, I got the errors below.

 

 

 

Command | Context | Status | Started at | Ended at
Start this Event Server | Event Server, ip-10-0-234-79 | Aborted | Apr 30, 2015 5:55:30 AM EDT | Apr 30, 2015 5:58:03 AM EDT
Start this Host Monitor | Host Monitor, ip-10-0-234-79 | Aborted | Apr 30, 2015 5:55:30 AM EDT | Apr 30, 2015 5:58:03 AM EDT
Start this Activity Monitor | Activity Monitor, ip-10-0-234-79 | Aborted | Apr 30, 2015 5:55:30 AM EDT | Apr 30, 2015 5:58:03 AM EDT
Start this Service Monitor | Service Monitor, ip-10-0-234-79 | Aborted | Apr 30, 2015 5:55:30 AM EDT | Apr 30, 2015 5:58:03 AM EDT
Start this Alert Publisher | Alert Publisher, ip-10-0-234-79 | Aborted | Apr 30, 2015 5:55:30 AM EDT | Apr 30, 2015 5:58:03 AM EDT

Each of the five commands reported the same message:

Command aborted because of exception: Command timed-out after 150 seconds

 

 

I tried the following to resolve the issue:

1) Verified passwordless communication between the hosts, that SELinux is disabled, and that iptables is stopped.

2) Stopped the Cloudera agents and server on all machines and restarted them.

3) Tried restarting ZooKeeper, HDFS, and MapReduce individually, but each one throws the same error.

 

Please advise me ...

 

 

 

Regards,

Sudhakar Reddy Kurakula


Re: Regarding Cloudera Management Services restart

Are your Cloudera Manager agents still running?

Re: Regarding Cloudera Management Services restart

Explorer

Yes, all the agents, the server, and the database are running properly.

 

Re: Regarding Cloudera Management Services restart

Does anything look unexpected on your hosts page? Did your new hosts come back with different IP addresses or hostnames? Can you ssh into the CM server host, then try to ping the agent hosts using the same hostname reported in the CM UI?
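The hostname check suggested above can also be scripted. The following is a minimal sketch, not a Cloudera tool: it resolves a hostname and attempts a plain TCP connection, which is a substitute for ping rather than the same thing. The function name `check_host` is hypothetical, and the port you pass is your own choice (for example, whichever port your agents listen on or heartbeat to, taken from your agent configuration).

```python
import socket


def check_host(hostname, port, timeout=5.0):
    """Return (resolved_ip, reachable) for hostname:port.

    resolved_ip is None when the name does not resolve. reachable is True
    only if a TCP connection to the port succeeds within the timeout.
    """
    try:
        ip = socket.gethostbyname(hostname)
    except socket.gaierror:
        # The name reported in the CM UI does not resolve from this machine.
        return None, False
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        # Name resolves, but the port is closed, filtered, or timed out.
        return ip, False
```

Run it on the CM server host with the exact hostname shown on the Hosts page, for example `check_host("ip-10-0-234-79", port)`. A result of `(None, False)` points at name resolution, while an IP with `False` points at the network path or the process behind that port.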

Re: Regarding Cloudera Management Services restart

Explorer

I have the same problem. My cluster went down because it ran out of disk space. After I cleaned up some unused data and tried to restart the cluster and its services, I got this error everywhere:

 

Command aborted because of exception: Command timed-out after 150 seconds

 

My hosts came back with their normal IP addresses and hostnames. I can also ssh into the CM server host and ping the agent hosts from there. Could the problem be related to Kerberos?

Re: Regarding Cloudera Management Services restart

Super Guru

@ducna and @KSReddy,

 

The command is timed out by Cloudera Manager since it did not receive an update from the agent regarding the successful or unsuccessful completion of that command.
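The timestamps in the original post are consistent with this. As a small worked check (the helper name `elapsed_seconds` is made up for illustration, and the "EDT" suffix is dropped because both times are in the same zone), the gap between the reported start and end times can be computed like so:

```python
from datetime import datetime

# Format of the timestamps as posted, without the trailing time zone name.
FMT = "%b %d, %Y %I:%M:%S %p"


def elapsed_seconds(started, ended):
    """Seconds between two timestamps written in the format above."""
    delta = datetime.strptime(ended, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


print(elapsed_seconds("Apr 30, 2015 5:55:30 AM", "Apr 30, 2015 5:58:03 AM"))  # 153.0
```

The commands ran for 153 seconds, just past the 150-second limit in the error message, which fits Cloudera Manager giving up while waiting rather than a role reporting a failure of its own.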

 

To debug, look at the agent log to see what transpired.

If you have a traumatic event like running out of disk space, it is reasonable to expect that ALL services on that host should be restarted, especially those that have files in the volume that ran out of space.

 

I recommend capturing the agent log (/var/log/cloudera-scm-agent/cloudera-scm-agent.log) output when you try running a command.  If you can't spot the problem, share it with us.
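If the log is long, a short script can pull out the lines worth reading first. This is only a sketch: the pattern list is a guess at what a failing agent might log, not a list taken from Cloudera documentation, and the names `SUSPECT`, `find_suspect_lines`, and `scan_log` are hypothetical.

```python
import re

# Illustrative patterns only; extend them with whatever your log actually shows.
SUSPECT = re.compile(r"ERROR|Traceback|SSLError|failed", re.IGNORECASE)


def find_suspect_lines(lines, pattern=SUSPECT):
    """Return (line_number, text) pairs for the lines that match pattern."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]


def scan_log(path, pattern=SUSPECT):
    """Scan a log file on disk, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return find_suspect_lines(handle, pattern)
```

For example, `scan_log("/var/log/cloudera-scm-agent/cloudera-scm-agent.log")` right after a command times out returns the matching lines with their line numbers, which is usually enough to decide what to paste into a reply here.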

 

It is likely you will need to restart the agent and the supervisor on that host.

If you cannot run any commands on the host (they all time out) and the agent is heartbeating to Cloudera Manager, then it may be necessary to perform a clean restart as outlined here:

 

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_ag_agents.html

Re: Regarding Cloudera Management Services restart

Explorer

Hi @bgooley, thanks for your reply. 

 

Here is what I found in my logs:

 - From /var/log/cloudera-scm-agent/cloudera-scm-agent.log:

Heartbeating to master.... failed
.... 
SSLError: certificate verify failed

 - From /var/log/cloudera-scm-server/cloudera-scm-server.log:

WARN 727197460@agentServer-7397:org.mortbay.log: javax.net.ssl.SSLException: Received fatal alert: certificate_expired

Any ideas?

Re: Regarding Cloudera Management Services restart

Expert Contributor

Hi,

 

The error output you provided, @ducna, shows pretty clearly what is wrong in your case. The server is telling you that your certificates are no longer valid: the certificates in use on your cluster and the associated host appear to be past their validity date and are therefore expired.

 

 

javax.net.ssl.SSLException: Received fatal alert: certificate_expired
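
To confirm which certificate has expired, you can print its end date with `openssl x509 -enddate -noout -in <certificate file>`, which outputs a line of the form `notAfter=Jan  5 09:34:43 2018 GMT`. The sketch below, which assumes that output format and uses made-up helper names, turns such a line into a yes/no answer using only the Python standard library. Where your certificate files live depends on how TLS was set up for Cloudera Manager, so the path is left to you.

```python
import ssl
import time


def parse_enddate(line):
    """Extract the date from an openssl 'notAfter=...' output line."""
    return line.strip().split("=", 1)[1]


def seconds_until_expiry(not_after, now=None):
    """Seconds left before not_after; zero or negative means expired.

    not_after uses the certificate time format, e.g. 'Jan  5 09:34:43 2018 GMT'.
    """
    if now is None:
        now = time.time()
    return ssl.cert_time_to_seconds(not_after) - now


def is_expired(not_after, now=None):
    """True if the certificate end date is at or before now."""
    return seconds_until_expiry(not_after, now) <= 0
```

Checking every certificate in the chain this way, on both the Cloudera Manager server and the agent hosts, shows which ones need to be reissued.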
Customer Operations Engineer | Security SME | Cloudera, Inc.

Re: Regarding Cloudera Management Services restart

Explorer
Hi,

Can you give me the solution to fix it? I am using CDH 5.9 with Kerberos.