cloudera-scm-server is dead and pid file exists

Explorer

Hi everyone, I was trying to install Cloudera Manager following the tutorial at www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-2/Cloudera-Manager-Installa...

After installation, I was not able to open Cloudera Manager at "localhost:7180". I tried to restart cloudera-scm-server using the command "service cloudera-scm-server restart"; however, it stopped after a few seconds and gave the following error message: "cloudera-scm-server is dead and pid file exists".

I checked the cloudera-scm-server log file, but nothing seems wrong. Does anybody have a solution for this problem? Thanks in advance.

 

 

1 ACCEPTED SOLUTION

Master Guru

Hi,

First, since you are installing, you are probably using Cloudera Manager 5.2.1 (latest), so the documentation you want is here:

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/installation_installati...

Since there is a .pid file, remove it first. Check whether the following file exists and delete it if it does:

        /var/run/cloudera-scm-server.pid
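
Before deleting the file, it can be worth confirming that it really is stale, meaning that no process with the recorded PID is still running. Here is a minimal sketch, assuming a POSIX shell and root privileges (without root, the liveness test can fail for a process owned by another user). The helper name `pid_file_is_stale` is hypothetical and not part of Cloudera Manager.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cloudera Manager): succeeds when the
# pid file exists but no process with the recorded PID is running.
pid_file_is_stale() {
    pid_file="$1"
    [ -f "$pid_file" ] || return 1      # no pid file, nothing to clean up
    pid="$(tr -d '[:space:]' < "$pid_file")"
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;        # empty or non-numeric content: treat as stale
    esac
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        return 1                        # process is alive, keep the file
    fi
    return 0
}

# Example usage (run as root):
#   if pid_file_is_stale /var/run/cloudera-scm-server.pid; then
#       rm -f /var/run/cloudera-scm-server.pid
#   fi
```

If the helper reports that the process is alive, the server is actually running and the pid file should be left alone.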

 

Next, if there is a problem starting, the logs likely hold valuable clues. Check /var/log/cloudera-scm-server/cloudera-scm-server.log.

Try running tail -f on that file while the server is starting to see if there are any exceptions.
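
One way to do this is to follow the log in one terminal while starting the service in another, and afterwards filter the log for error lines. This is only a sketch: `recent_log_errors` is a hypothetical helper name, and the case-insensitive "error|exception" pattern is an assumption about what is worth looking for.

```shell
#!/bin/sh
# Terminal 1: follow the log live while the service starts.
#   tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
# Terminal 2: start the service.
#   service cloudera-scm-server start

# Hypothetical helper: print the last N matching lines (default 20), with
# line numbers, that mention an error or exception (case-insensitive).
recent_log_errors() {
    log_file="$1"
    max_lines="${2:-20}"
    grep -niE 'error|exception' "$log_file" | tail -n "$max_lines"
}

# Example usage:
#   recent_log_errors /var/log/cloudera-scm-server/cloudera-scm-server.log
```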

 

If the solution is not obvious, post the details and someone here can help.


18 REPLIES

Explorer

Thank you so much for the advice. I ran into quite a lot of difficulties using Ubuntu, so I reinstalled with Red Hat and followed the path you gave me; the problem no longer exists. Thanks again.

New Contributor

Hi,

I've faced the same issue, except that I am not able to resolve it in any way. I hit the issue when I upgraded Cloudera Manager from 5.3.2 to the latest version (5.4.3).

I followed the procedure described at http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_ag_upgrade_cm5.html, and after the upgrade I am not able to start cloudera-scm-server. It starts, but when I check its status it says "cloudera-scm-server dead but pid file exists". I remove "/var/run/cloudera-scm-server.pid" and start the server again; it starts, and when I check the status I again get "cloudera-scm-server dead but pid file exists".

I checked that cloudera-scm-server-db is running, and I was able to connect with "psql -U scm -p 7432 -h localhost"; all the data is there. The cloudera-scm-agent is also running without any issue. I cannot diagnose what is causing the problem, since there isn't a single log entry in either /var/log/cloudera-scm-server/cloudera-scm-server.out or /var/log/cloudera-scm-server/cloudera-scm-server.log. I ran tail -f on both files and not a single line appeared during server start.

I would be grateful if someone could help, since I am stuck in the middle of an upgrade.

 

New Contributor

I solved the issue. The logs weren't showing because SELinux is enabled; I don't know what the relation between them is. Once I had the logs enabled, I saw the reason for the failure. In my previous attempts to run the server, I had run it as root directly from a low-level script (which doesn't set the Java memory correctly), and it failed with an OutOfMemory error. This created a *.hprof file owned by root. On the next attempt to run the server, it wasn't able to read the hprof file (again, why does the server need to read the heap dump?!), and that's why it was failing. The solution was to move or remove the *.hprof file from /usr/share/cmf.
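
For anyone hitting the same thing, the checks described above can be sketched roughly as follows. This assumes a Red Hat style system where `getenforce` is installed and a `find` that supports `-maxdepth` (GNU or BSD). The helper name `list_heap_dumps` is hypothetical, and /tmp as the destination for the moved files is only an example.

```shell
#!/bin/sh
# Hypothetical helper: list heap dump files sitting directly in a directory,
# in long format so that the owner of each file is visible.
list_heap_dumps() {
    dir="$1"
    find "$dir" -maxdepth 1 -type f -name '*.hprof' -exec ls -l {} \;
}

# Example usage (run as root):
#   getenforce                        # prints Enforcing, Permissive, or Disabled
#   list_heap_dumps /usr/share/cmf    # look for dumps owned by root
#   mv /usr/share/cmf/*.hprof /tmp/   # example destination only; then start the server again
```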

Rising Star

If the above step doesn't fix your issue, PostgreSQL may not be running. To check, try running the command:

service postgresql restart

If that command fails, check /etc/hosts to see whether the loopback entry is missing:

127.0.0.1 localhost

If it is missing, add the line, save the file, and then try starting the postgresql service:

service postgresql start
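
The /etc/hosts check can also be scripted. This is a minimal sketch, assuming standard sed and awk; the helper name `has_loopback_entry` is hypothetical. It ignores commented-out lines and succeeds only if some 127.0.0.1 line lists localhost as one of its names.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the hosts file maps 127.0.0.1 to localhost.
has_loopback_entry() {
    hosts_file="${1:-/etc/hosts}"
    # Strip comments, then look for a 127.0.0.1 line naming "localhost".
    sed 's/#.*//' "$hosts_file" | awk '
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) if ($i == "localhost") found = 1
        }
        END { exit !found }
    '
}

# Example usage (run as root):
#   has_loopback_entry || echo '127.0.0.1 localhost' >> /etc/hosts
#   service postgresql start
```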

 

Hope this will help

 

Guruveer

New Contributor

We followed these very steps, and in our case there is nothing at all wrong with localhost. Still, postgresql refuses to start, claiming in the logs "cannot resolve localhost", which is BS. We can ping and ssh to localhost.

New Contributor

LOG: could not translate host name "localhost", service "5432" to address: Name or service not known

Master Guru

@ChrisEns

I searched for that error on Google, and there are quite a few hits for that exact message, along with a few different tips and solutions.

Usually the issue is due to a malformed /etc/hosts file. That would be a good place to check first. Make sure you have at least:

127.0.0.1   localhost

and nothing else "odd".

Regards,

Ben

Explorer

Thanks for the reply.

I am also having the same problem. As you advised, I removed the file /var/run/cloudera-scm-server.pid.

Then I started the Cloudera service and hit the same issue again, and the removed file was automatically regenerated.

 

[root@cdh1 ~]# service cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]
[root@cdh1 ~]# service cloudera-scm-server status
cloudera-scm-server dead but pid file exists
[root@cdh1 ~]# rm /var/run/cloudera-scm-server.pid
rm: remove regular file `/var/run/cloudera-scm-server.pid'? y
[root@cdh1 ~]#