Member since: 11-12-2015
Posts: 22
Kudos Received: 1
Solutions: 0
07-28-2016
08:32 AM
Thank you. I have updated to the latest HDP 2.4, but there might be a newer version by the time I do the Ubuntu update. Currently, I am running Ubuntu 14.04. It seems the best solution would be to disable the Hortonworks repo during the upgrade and then enable it again.
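The disable/enable step could be sketched roughly as below. This is only an illustration, not a tested procedure for HDP: it relies on the fact that apt on Ubuntu 14.04 only reads files in /etc/apt/sources.list.d/ whose names end in .list, so renaming the file hides the repo from apt. The file name HDP.list and the helper names are assumptions; check what Ambari actually wrote on each node.

```python
from pathlib import Path

# Suffix that makes apt ignore the file (it no longer ends in ".list").
DISABLED_SUFFIX = ".disabled"


def disable_repo(list_file: Path) -> Path:
    """Rename e.g. HDP.list to HDP.list.disabled so apt skips it."""
    target = list_file.with_name(list_file.name + DISABLED_SUFFIX)
    list_file.rename(target)
    return target


def enable_repo(disabled_file: Path) -> Path:
    """Undo disable_repo by stripping the .disabled suffix."""
    if not disabled_file.name.endswith(DISABLED_SUFFIX):
        raise ValueError(f"{disabled_file} was not disabled by this script")
    original_name = disabled_file.name[: -len(DISABLED_SUFFIX)]
    target = disabled_file.with_name(original_name)
    disabled_file.rename(target)
    return target


if __name__ == "__main__":
    # Hypothetical path; run as root, then run "apt-get update"
    # before "apt-get dist-upgrade" so apt forgets the HDP package lists.
    repo = Path("/etc/apt/sources.list.d/HDP.list")
    print(disable_repo(repo))
```

After the Ubuntu upgrade, calling enable_repo on the renamed file and running apt-get update again should bring the repo back.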
07-14-2016
07:38 AM
Hi, I have just upgraded to HDP-2.4.2 and Ambari 2.2.2.0. The cluster is installed on 6 nodes running Ubuntu 14.04 Server. I want to update the Ubuntu packages (apt-get dist-upgrade) without updating or modifying the HDP packages. What should I consider when doing this? Do I have to remove the Hortonworks repo during the upgrade and add it back later? Are there any side effects that must be taken into account?
07-11-2016
02:40 PM
Hi @swagle, thank you very much for the support. After several retries I managed to delete the service and install it again on another host. It worked without my doing anything different from before, except that I had to set zookeeper.znode.parent to the HBase value. I really don't know why it worked this time.
07-09-2016
10:29 AM
Hi Geoffrey, I reinstalled the service in Ambari, but it hung while trying to start the metrics collector again. This is the error from the log file:
2016-07-07 01:42:14,211 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)
2016-07-07 01:42:14,212 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
07-09-2016
10:29 AM
Hi Geoffrey, I tried the doc and reinstalled the service, but it hangs while starting the metrics collector again. This is from the log file:
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-07-07 01:42:14,211 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)
2016-07-07 01:42:14,212 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-07-07 01:42:15,309 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)
2016-07-07 01:42:15,311 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
07-09-2016
10:29 AM
Hi, I changed the port to 61181, but it is not able to connect. I see no service running on port 61181. The log shows the following messages:
2016-07-06 18:37:42,385 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server m2.tmaut.tlabsdata.com/172.16.164.131:61181. Will not attempt to authenticate using SASL (unknown error)
2016-07-06 18:37:42,386 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server m2.tmaut.tlabsdata.com/172.16.164.131:61181. Will not attempt to authenticate using SASL (unknown error)
2016-07-06 18:37:42,386 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-07-06 18:37:42,387 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
I have also attached the hbase-site.xml.
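A "Connection refused" on every attempt usually means nothing is listening on that host and port. A quick way to confirm this from the collector host is a plain TCP probe; the sketch below is a generic check, not part of Ambari, and the function name is made up for illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Ports taken from the logs in this thread: 61181 is the embedded
    # AMS ZooKeeper port, 2181 is the cluster ZooKeeper port.
    for port in (61181, 2181):
        print(port, port_is_open("localhost", port))
```

If 61181 is closed while 2181 is open, the client is pointing at a ZooKeeper instance that was never started rather than hitting a network problem.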
07-08-2016
03:42 PM
Also, moving ambari-metrics-collector to another host fails in the wizard with the following error:
stderr:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/service_check.py", line 165, in <module>
AMSServiceCheck().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 216, in execute
method(env)
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/service_check.py", line 92, in service_check
raise Fail("Metrics were not saved. Service check has failed. "
resource_management.core.exceptions.Fail: Metrics were not saved. Service check has failed.
Connection failed.
stdout:
2016-07-08 15:41:07,832 - Ambari Metrics service check was started.
2016-07-08 15:41:07,844 - Generated metrics:
{
"metrics": [
{
"metricname": "AMBARI_METRICS.SmokeTest.FakeMetric",
"appid": "amssmoketestfake",
"hostname": "w1.domain",
"timestamp": 1467992467000,
"starttime": 1467992467000,
"metrics": {
"1467992467000": 0.113469705131,
"1467992468000": 1467992467000
}
}
]
}
2016-07-08 15:41:07,844 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:17,856 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:17,857 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:27,867 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:27,867 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:37,878 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:37,878 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:47,891 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:47,892 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:57,904 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:57,905 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:42:07,919 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:42:07,919 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:42:17,929 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:42:17,930 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:42:27,941 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:42:27,942 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:42:37,956 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:42:37,956 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
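To reproduce what the service check is doing outside of Ambari, something like the sketch below could be used. It is not the actual service_check.py code: the payload shape and the /ws/v1/timeline/metrics/ path are copied from the output above, while the http:// scheme, the function names, and the retry parameters are assumptions.

```python
import json
import time
import urllib.request


def build_smoke_metric(hostname: str, now_ms: int, value: float) -> dict:
    """Build a payload shaped like the one printed in the stdout above."""
    return {
        "metrics": [
            {
                "metricname": "AMBARI_METRICS.SmokeTest.FakeMetric",
                "appid": "amssmoketestfake",
                "hostname": hostname,
                "timestamp": now_ms,
                "starttime": now_ms,
                "metrics": {str(now_ms): value, str(now_ms + 1000): now_ms},
            }
        ]
    }


def post_with_retries(url, payload, tries=10, delay_s=10.0,
                      opener=urllib.request.urlopen, sleep=time.sleep) -> bool:
    """POST the payload as JSON, retrying on connection errors."""
    body = json.dumps(payload).encode("utf-8")
    for attempt in range(1, tries + 1):
        req = urllib.request.Request(
            url, data=body,
            headers={"Content-Type": "application/json"}, method="POST")
        try:
            with opener(req, timeout=10) as resp:
                return 200 <= resp.status < 300
        except OSError:
            # URLError and HTTPError are OSError subclasses, so refused
            # connections and HTTP error statuses both end up here.
            if attempt < tries:
                sleep(delay_s)
    return False


if __name__ == "__main__":
    now = int(time.time() * 1000)
    payload = build_smoke_metric("w1.domain", now, 0.5)
    # Assumes plain HTTP; the log line does not show the scheme.
    print(post_with_retries("http://w3.domain:6188/ws/v1/timeline/metrics/", payload))
```

If this also fails with connection errors, the collector on w3.domain is most likely not listening on 6188 at all, which matches the collector failing to start.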
07-08-2016
12:51 PM
I found a wrong hostname in rootdir. After that, I am getting:
2016-07-08 12:44:39,320 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server m2.domain/172.16.164.131:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-08 12:44:39,321 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to m2.domain/172.16.164.131:2181, initiating session
2016-07-08 12:44:39,328 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server m2.domain/172.16.164.131:2181, sessionid = 0x255ca408b8d0063, negotiated timeout = 40000
2016-07-08 12:44:50,376 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:45:07,243 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:45:16,166 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:45:32,517 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:45:54,803 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:46:10,720 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:46:37,467 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:47:01,600 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:47:01,600 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=10, retries=35, started=142264 ms ago, cancelled=false, msg=
07-08-2016
11:59 AM
1 Kudo
So I removed the ambari-metrics service and added it again (moving it to another node didn't work). I also made some changes:
- switched to distributed mode
- modified zookeeper.znode.parent=/hbase-secure
- manually recreated ams.collector.keytab and zk.service.keytab due to authentication errors in the log
- changed hbase.zookeeper.property.clientPort from 61181 to 2181
- changed rootdir from local to HDFS
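For reference, the HBase-side overrides from the list above might look like the following when rendered in Hadoop's *-site.xml format. This is only a sketch: on an Ambari-managed cluster these values are set through the Ambari UI, not by hand-editing files. The hbase.rootdir URL is a made-up placeholder, and mapping "distributed mode" to hbase.cluster.distributed=true is an assumption about which HBase flag goes with it.

```python
import xml.etree.ElementTree as ET

# Values taken from the list above unless marked otherwise.
AMS_HBASE_OVERRIDES = {
    # Assumption: the HBase flag that accompanies AMS distributed mode.
    "hbase.cluster.distributed": "true",
    "zookeeper.znode.parent": "/hbase-secure",
    "hbase.zookeeper.property.clientPort": "2181",
    # Hypothetical HDFS location; the real path was not given in the post.
    "hbase.rootdir": "hdfs://namenode.example:8020/apps/ams/hbase",
}


def render_site_xml(props: dict) -> str:
    """Render a dict as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_site_xml(AMS_HBASE_OVERRIDES))
```

Comparing this output against the hbase-site.xml that the collector actually loads is a quick way to spot a value that Ambari did not push to the host.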
I think I am getting close, as AMS can connect to ZooKeeper:
INFO org.apache.phoenix.query.ConnectionQueryServicesImpl: Successfull login to secure cluster!!
However, I am getting an error connecting to HBase:
WARN org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.query.DefaultPhoenixDataSource: Unable to connect to HBase store using Phoenix.
java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
I'll attach the logs in a comment.