
Services are not coming up

Guru

I configured Kerberos and everything worked fine for a long time. Since yesterday, however, I have not been able to restart services, and the only error I can see in the Ambari logs is the one below.

Can someone please help me figure out the issue?

18 Mar 2017 10:03:48,898 ERROR [pool-9-thread-11] BaseProvider:240 - Caught exception getting JMX metrics : Connection refused, skipping same exceptions for next 5 minutes
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
    at sun.net.www.http.HttpClient.New(HttpClient.java:308)
    at sun.net.www.http.HttpClient.New(HttpClient.java:326)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:996)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:932)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:850)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1300)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
    at org.apache.ambari.server.controller.internal.URLStreamProvider.processURL(URLStreamProvider.java:209)
    at org.apache.ambari.server.controller.internal.URLStreamProvider.processURL(URLStreamProvider.java:133)
    at org.apache.ambari.server.controller.internal.URLStreamProvider.readFrom(URLStreamProvider.java:107)
    at org.apache.ambari.server.controller.internal.URLStreamProvider.readFrom(URLStreamProvider.java:112)
    at org.apache.ambari.server.controller.metrics.RestMetricsPropertyProvider.populateResource(RestMetricsPropertyProvider.java:226)
    at org.apache.ambari.server.controller.metrics.ThreadPoolEnabledPropertyProvider$1.call(ThreadPoolEnabledPropertyProvider.java:180)
    at org.apache.ambari.server.controller.metrics.ThreadPoolEnabledPropertyProvider$1.call(ThreadPoolEnabledPropertyProvider.java:178)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
18 Mar 2017 10:05:05,882 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:496 - Updating applied config on service HDFS, component JOURNALNODE, host w1.hdp22
18 Mar 2017 10:05:05,884 INFO [ambari-heartbeat-processor-0] ServiceComponentHostImpl:1047 - Host role transitioned to a new state, serviceComponentName=JOURNALNODE, hostName=w1.hdp22, oldState=STARTING, currentState=STARTED


Re: Services are not coming up

Super Mentor

@Saurabh

The error you mention appears to be unrelated. It shows up in the Ambari Server log from time to time, so we can ignore it for now.

 ERROR [pool-9-thread-11] BaseProvider:240 - Caught exception getting JMX metrics : Connection refused, skipping same exceptions for next 5 minutes

- What happens when you try to start a service from the Ambari UI? Do you see the operation in the Operations History? Does it show success, a timeout, or an error?

- Which version of Ambari are you running?

- Are the agents running fine? When you trigger a service restart from the Ambari UI, do the ambari-agent logs show that the agents received and accepted the request?

- Do you see any errors in the component logs, such as the NameNode or DataNode log, when you try to restart the HDFS service?


Re: Services are not coming up

Guru

Thanks @Jay SenSharma for the quick response.

In the Ambari UI I can see that it generates a ticket again and again and keeps running the following checks.

_, out, err = get_user_call_output(cmd, user=self.run_user, logoutput=self.logoutput, quiet=False)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_user_call_output.py", line 61, in get_user_call_output
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'curl -sS -L -w '%{http_code}' -X PUT --negotiate -u : 'http://m1.hdp22:50070/webhdfs/v1/ats/done?op=SETPERMISSION&user.name=hdfs&permission=755' 1>/tmp/tmpT7uLM1 2>/tmp/tmpiNivOT' returned 7. curl: (7) couldn't connect to host
401

Re: Services are not coming up

Guru

Also, Ambari is taking a long time to report the status of any service. The status indicator just turns gray, which I take to mean it is waiting for something. Is there any way to check why it is taking so long?


Re: Services are not coming up

Guru
2017-03-18 10:32:18,023 - Skipping the operation for not managed DFS directory /tmp since immutable_paths contains it.
2017-03-18 10:32:18,024 - HdfsResource['/tmp/entity-file-history/active'] {'security_enabled': True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': '/etc/security/keytabs/hdfs.headless.keytab', 'dfs_type': '', 'default_fs': 'hdfs://TESTHA', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'hdfs-hdp22@HADOOPADMIN.COM', 'user': 'hdfs', 'owner': 'yarn', 'group': 'hadoop', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'directory', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse', u'/tmp', u'/app-logs', u'/mr-history/done']}
2017-03-18 10:32:18,025 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-hdp22@HADOOPADMIN.COM'] {'user': 'hdfs'}

2017-03-18 10:32:53,549 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://m1.hdp22:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpHSYkXh 2>/tmp/tmpuY87Ud''] {'quiet': False}

Re: Services are not coming up

Super Mentor

@Saurabh

Since you mentioned that Ambari is responding very slowly, a thread dump would be really useful. Do you see any stuck or blocked threads in the ambari-server thread dump? You can collect the thread dump using the script described in this article: https://community.hortonworks.com/articles/72319/how-to-collect-threaddump-using-jcmd-and-analyse-i.....

- Also, please check that enough free memory is available, and run the following commands to find out the Ambari Server memory stats (replace 2797 with the PID of your Ambari Server process):

# /usr/jdk64/jdk1.8.0_60/bin/jmap -heap 2797
# free -m

- Have you recently restarted ambari-agents?

- Is the NameNode port open and accessible?

# telnet m1.hdp22 50070

- Also, can you run kinit successfully and obtain a Kerberos ticket?

Example:

# kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-JoyCluster@EXAMPLE.COM   
# klist
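
If telnet is not installed on the host, the port check above can be approximated with a few lines of Python. This is only a minimal sketch: the helper name `port_is_open` and the 5-second timeout are assumptions made for illustration, and a successful TCP connect says nothing about whether Kerberos/SPNEGO authentication will succeed afterwards.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; it mirrors what `telnet host port`
    tells you, nothing more.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For example, `port_is_open("m1.hdp22", 50070)` returning False would be consistent with the "Connection refused" and `curl: (7) couldn't connect to host` messages reported earlier in this thread.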


Re: Services are not coming up

Guru

I still see the same ticket generation and service checks.

2017-03-18 10:47:19,228 - HdfsResource['/hdp/apps/2.3.4.0-3485/slider/slider.tar.gz'] {'security_enabled': True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': '/etc/security/keytabs/hdfs.headless.keytab', 'source': '/usr/hdp/2.3.4.0-3485/slider/lib/slider.tar.gz', 'dfs_type': '', 'default_fs': 'hdfs://TESTHA', 'replace_existing_files': False, 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'hdfs-hdp22@HADOOPADMIN.COM', 'user': 'hdfs', 'owner': 'hdfs', 'group': 'hadoop', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'file', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse', u'/tmp', u'/app-logs', u'/mr-history/done'], 'mode': 0444}
2017-03-18 10:47:19,229 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-hdp22@HADOOPADMIN.COM'] {'user': 'hdfs'}
2017-03-18 10:47:53,052 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://m1.hdp22:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpoC7TzQ 2>/tmp/tmpMKSZXy''] {'quiet': False}
2017-03-18 10:48:25,072 - call returned (0, '')
2017-03-18 10:48:25,072 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://m2.hdp22:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmp3GWTGJ 2>/tmp/tmpay6Rih''] {'quiet': False}
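
To see where the time goes in an excerpt like the one above, it can help to compute the gap between consecutive timestamped entries. The following is a rough sketch, not an Ambari tool: the function name `step_gaps` is made up for illustration, and it assumes every entry starts with a `YYYY-MM-DD HH:MM:SS,mmm - ` prefix as in the lines shown here. The messages in `SAMPLE` are shortened; only the timestamps are taken from the log above.

```python
import re
from datetime import datetime

# Matches the "2017-03-18 10:47:19,229 - message" prefix used in the excerpt.
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*)$")

# Timestamps copied from the excerpt; messages shortened for readability.
SAMPLE = [
    "2017-03-18 10:47:19,229 - Execute[kinit as hdfs]",
    "2017-03-18 10:47:53,052 - call[curl JMX on m1.hdp22:50070]",
    "2017-03-18 10:48:25,072 - call returned (0, '')",
]


def step_gaps(lines):
    """Return (seconds_until_next_entry, message) for each entry but the last."""
    entries = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:  # lines without a timestamp prefix are ignored
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            entries.append((stamp, match.group(2)))
    return [
        ((later - earlier).total_seconds(), message)
        for (earlier, message), (later, _) in zip(entries, entries[1:])
    ]
```

Calling `step_gaps(SAMPLE)` gives a gap of about 33.8 seconds after the kinit entry and about 32.0 seconds after the first curl call, which suggests that each Kerberos-related step is taking roughly half a minute.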

Re: Services are not coming up

Guru

mapred-mapred-historyserver-m2hdp22log.zip

Now the History Server has failed to start. I have attached the logs.


Re: Services are not coming up

Super Mentor

@Saurabh

Do you have sufficient resources (memory) available on the host where the History Server is running?

I see a SIGTERM (signal 15) in your logs, which looks normal. Apart from that, I do not see any additional errors in today's log ('2017-03-18').

2017-03-18 08:18:37,662 ERROR hs.JobHistoryServer (LogAdapter.java:error(69)) - RECEIVED SIGNAL 15: SIGTERM


Re: Services are not coming up

Guru

@Jay SenSharma: I noticed one thing. I have integrated Kerberos with AD, and I had disabled the Kerberos client managed by Ambari. I suspect this is why it kept repeating the Kerberos authentication checks for every service.

For testing purposes I enabled it again and then restarted the services, and the restart completed in 312 seconds.

Do you also see this as the root cause?
