Member since: 06-10-2016
Posts: 30
Kudos Received: 4
Solutions: 5
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1804 | 01-23-2018 11:27 PM |
| | 2138 | 10-30-2017 08:23 PM |
| | 2116 | 02-24-2017 07:15 PM |
| | 2057 | 12-11-2016 11:07 PM |
| | 4551 | 09-01-2016 09:35 PM |
01-23-2018
11:27 PM
I solved it by following these instructions: https://cwiki.apache.org/confluence/display/AMBARI/Cleaning+up+Ambari+Metrics+System+Data
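For reference, a minimal sketch of the procedure that page describes, assuming an embedded-mode AMS with the default data directories (verify hbase.rootdir and hbase.tmp.dir under Ambari Metrics > Configs before deleting anything):

# Stop the collector first (via Ambari, or on the collector host):
ambari-metrics-collector stop
# Default embedded-mode data directories -- assumptions; confirm them in ams-hbase-site:
rm -rf /var/lib/ambari-metrics-collector/hbase/*
rm -rf /var/lib/ambari-metrics-collector/hbase-tmp/*
# Restart; the collector recreates its HBase tables from scratch (metric history is lost):
ambari-metrics-collector start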
01-23-2018
11:27 PM
Hello, the host where the NameNode and the AMS services run filled up. I cleaned up the disk, but now the AMS Collector doesn't start. This is the AMS Collector's error output:

/var/log/ambari-metrics-collector/hbase-ams-master-hw.example.com.out

2018-01-23 10:29:01,077 INFO [main] zookeeper.ZooKeeper: Initiating client connection, connectString=hw.example.com:61181 sessionTimeout=120000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@c540f5a
2018-01-23 10:29:01,095 INFO [main-SendThread(hw.example.com:61181)] zookeeper.ClientCnxn: Opening socket connection to server hw.example.com/10.1.0.12:61181. Will not attempt to authenticate using SASL (unknown error)
2018-01-23 10:29:01,114 WARN [main-SendThread(hw.example.com:61181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2018-01-23 10:29:02,222 INFO [main-SendThread(hw.example.com:61181)] zookeeper.ClientCnxn: Opening socket connection to server hw.example.com/10.1.0.12:61181. Will not attempt to authenticate using SASL (unknown error)
2018-01-23 10:29:02,222 WARN [main-SendThread(hw.example.com:61181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2018-01-23 10:29:02,324 WARN [main] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=hw.example.com:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/master
2018-01-23 10:29:02,324 ERROR [main] zookeeper.RecoverableZooKeeper: ZooKeeper getData failed after 1 attempts
2018-01-23 10:29:02,324 WARN [main] zookeeper.ZKUtil: clean znode for master0x0, quorum=hw.example.com:61181, baseZNode=/ams-hbase-secure Unable to get data of znode /ams-hbase-secure/master
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:714)
at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:267)
at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:149)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2838)
2018-01-23 10:29:02,325 ERROR [main] zookeeper.ZooKeeperWatcher: clean znode for master0x0, quorum=hw.example.com:61181, baseZNode=/ams-hbase-secure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:714)
at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:267)
at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:149)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2838)
2018-01-23 10:29:02,325 WARN [main] zookeeper.ZooKeeperNodeTracker: Can't get or delete the master znode
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:714)
at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:267)
at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:149)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2838)
/var/log/ambari-metrics-collector/ambari-metrics-collector.log

2018-01-23 10:29:01,191 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hw.example.com/10.1.0.12:61181. Will not attempt to authenticate using SASL (unknown error)
2018-01-23 10:29:01,192 WARN org.apache.zookeeper.ClientCnxn: Session 0x16123a0fb540000 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
2018-01-23 10:29:01,298 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=hw.example.com:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/meta-region-server
2018-01-23 10:29:02,339 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hw.example.com/10.1.0.12:61181. Will not attempt to authenticate using SASL (unknown error)
2018-01-23 10:29:02,340 WARN org.apache.zookeeper.ClientCnxn: Session 0x16123a0fb540000 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)

No services are listening on ports 6188 and 61181. I've configured HBase's tick time with "hbase.zookeeper.property.tickTime = 6000". Thanks in advance.
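As a quick sanity check (a sketch, assuming a Linux host with ss available), you can confirm that nothing is bound to the collector and embedded ZooKeeper ports:

# List listeners on the AMS collector (6188) and embedded ZooKeeper (61181) ports.
ss -lntp | grep -E ':(6188|61181)'
# Empty output means the embedded HBase/ZooKeeper never came up, so the
# ConnectionLoss errors above are a symptom rather than the cause.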
10-30-2017
08:23 PM
I solved it by adding a comment in Spark2 > Configs > Advanced spark2-env. After that, I restarted Spark2 and its clients, and the new configuration files were deployed.
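A quick way to confirm the new files were deployed (a sketch; paths assume the stock HDP layout):

# /etc/spark2/conf should resolve to the versioned directory and now contain
# spark-defaults.conf, spark-env.sh, etc.
ls -l /etc/spark2/conf
ls -l /usr/hdp/current/spark2-client/conf/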
10-26-2017
05:29 PM
Hi @Aditya Sirna, the directory /usr/hdp/2.6.2.0-205/spark2/conf/ is empty, but I do have these packages installed: spark2_2_6_2_0_205-python-2.1.1.2.6.2.0-205.noarch and spark2_2_6_2_0_205-2.1.1.2.6.2.0-205.noarch.
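To see which files those packages actually ship (a sketch, assuming an RPM-based system):

# List the files the installed Spark2 package provides and look for config templates.
rpm -ql spark2_2_6_2_0_205 | grep -i conf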
10-26-2017
04:20 PM
Hi, I just installed Spark2 from the Ambari wizard and Spark2's configuration directory is empty:

> ls -l /etc/spark2/2.6.2.0-205/0
total 0

The installation output is:

14:39:22,875 - Backing up /etc/spark2/conf to /etc/spark2/conf.backup if destination doesn't exist already.
14:39:22,875 - Execute[('cp', '-R', '-p', '/etc/spark2/conf', '/etc/spark2/conf.backup')] {'not_if': 'test -e /etc/spark2/conf.backup', 'sudo': True}
14:39:22,897 - Checking if need to create versioned conf dir /etc/spark2/2.6.2.0-205/0
14:39:22,900 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'dry-run-create', '--package', 'spark2', '--stack-version', u'2.6.2.0-205', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
14:39:22,940 - call returned (0, '/etc/spark2/2.6.2.0-205/0', '')
14:39:22,941 - Package spark2 will have new conf directories: /etc/spark2/2.6.2.0-205/0
14:39:22,946 - Checking if need to create versioned conf dir /etc/spark2/2.6.2.0-205/0
14:39:22,952 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'create-conf-dir', '--package', 'spark2', '--stack-version', u'2.6.2.0-205', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
14:39:22,987 - call returned (1, '/etc/spark2/2.6.2.0-205/0 exist already', '')
14:39:22,988 - checked_call[('ambari-python-wrap', u'/usr/bin/conf-select', 'set-conf-dir', '--package', 'spark2', '--stack-version', u'2.6.2.0-205', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
14:39:23,022 - checked_call returned (0, '/usr/hdp/2.6.2.0-205/spark2/conf -> /etc/spark2/2.6.2.0-205/0')
14:39:23,023 - Ensuring that spark2 has the correct symlink structure
14:39:23,024 - Execute[('cp', '-R', '-p', '/etc/spark2/conf', '/etc/spark2/conf.backup')] {'not_if': 'test -e /etc/spark2/conf.backup', 'sudo': True}
14:39:23,033 - Skipping Execute[('cp', '-R', '-p', '/etc/spark2/conf', '/etc/spark2/conf.backup')] due to not_if
14:39:23,034 - Directory['/etc/spark2/conf'] {'action': ['delete']}
14:39:23,034 - Removing directory Directory['/etc/spark2/conf'] and all its content
14:39:23,035 - Link['/etc/spark2/conf'] {'to': '/etc/spark2/conf.backup'}
14:39:23,035 - Creating symbolic Link['/etc/spark2/conf'] to /etc/spark2/conf.backup
14:39:23,036 - Link['/etc/spark2/conf'] {'action': ['delete']}
14:39:23,036 - Deleting Link['/etc/spark2/conf']
14:39:23,037 - Link['/etc/spark2/conf'] {'to': '/usr/hdp/current/spark2-client/conf'}
14:39:23,037 - Creating symbolic Link['/etc/spark2/conf'] to /usr/hdp/current/spark2-client/conf
14:39:23,037 - /etc/hive/conf is already linked to /etc/hive/2.6.2.0-205/0

I'm using Ambari 2.5.2.0, HDP 2.6.2.0-205, and Spark2 2.1.1. Do you know what happened? Is there a way to install the Spark2 configuration again? Thanks in advance.
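For what it's worth, a sketch of re-running the conf-select steps from the output above by hand (the same commands Ambari invokes; run as root, and adjust the stack version if yours differs):

# Recreate and activate the versioned conf dir, mirroring the install output.
ambari-python-wrap /usr/bin/conf-select create-conf-dir --package spark2 --stack-version 2.6.2.0-205 --conf-version 0
ambari-python-wrap /usr/bin/conf-select set-conf-dir --package spark2 --stack-version 2.6.2.0-205 --conf-version 0
# Then restart Spark2 from Ambari so it regenerates the actual config files
# into the (currently empty) /etc/spark2/2.6.2.0-205/0 directory.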
Labels:
- Apache Spark
09-17-2017
04:20 AM
Hello, while following the documentation for upgrading to Ambari 2.5.2 I got stuck on this line: "Record the location of the Metrics Collector component before you begin the upgrade process." What does it mean? Does it refer to the path of the Metrics Collector's database? Thanks in advance.
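In case it helps others: it most likely means recording which host runs the Metrics Collector, not a database path. A sketch of looking it up through the Ambari REST API (the server hostname, credentials, and cluster name are placeholders):

# Ask Ambari which host(s) the METRICS_COLLECTOR component runs on.
curl -u admin:admin 'http://ambari-server.example.com:8080/api/v1/clusters/MYCLUSTER/services/AMBARI_METRICS/components/METRICS_COLLECTOR?fields=host_components/HostRoles/host_name'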
Labels:
- Apache Ambari
03-30-2017
07:57 PM
I tried the YARN API and got this error message:

[yarn@foo ~]$ curl -v -X PUT -d '{"state": "KILLED"}' 'http://foo.example.com:8088/ws/v1/cluster/apps/application_1487024494103_0099'
* About to connect() to foo.example.com port 8088 (#0)
* Trying 192.168.1.1...
* Connected to foo.example.com (192.168.1.1) port 8088 (#0)
> PUT /ws/v1/cluster/apps/application_1487024494103_0099 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: foo.example.com:8088
> Accept: */*
> Content-Length: 19
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 19 out of 19 bytes
< HTTP/1.1 500 Internal Server Error
< Cache-Control: no-cache
< Expires: Thu, 30 Mar 2017 19:51:36 GMT
< Date: Thu, 30 Mar 2017 19:51:36 GMT
< Pragma: no-cache
< Expires: Thu, 30 Mar 2017 19:51:36 GMT
< Date: Thu, 30 Mar 2017 19:51:36 GMT
< Pragma: no-cache
< Content-Type: application/json
< Transfer-Encoding: chunked
< Server: Jetty(6.1.26.hwx)
<
* Connection #0 to host foo.example.com left intact
{"RemoteException":{"exception":"WebApplicationException","javaClassName":"javax.ws.rs.WebApplicationException"}}
03-16-2017
02:47 PM
I'm trying to kill an application in YARN but I get the message "Waiting for application ID to be killed". Is there a way to kill it faster? Thanks in advance.
Labels:
- Apache Spark
- Apache YARN
02-24-2017
07:15 PM
I found the problem: on the device that filled up, the file /var/lib/ambari-agent/data/structured-out-status.json differs from the one on the other nodes. I followed these steps as root:

rm -f /var/lib/ambari-agent/data/structured-out-status.json
ambari-agent restart

I also deleted the PID files in /var/run for the services that weren't responding to restarts (such as ZooKeeper and the Ambari Metrics Collector). After that, Ambari showed those processes as down, so I started them, and now everything works correctly.
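For the PID cleanup, a sketch with example paths (assumptions; check /var/run for whichever services are actually stuck on your host):

# Remove stale PID files left behind by the full disk so Ambari sees the
# services as down and can start them again. Paths are examples only.
rm -f /var/run/zookeeper/zookeeper_server.pid
rm -f /var/run/ambari-metrics-collector/ambari-metrics-collector.pid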
02-24-2017
06:24 PM
An application filled up the HDD, and after the cleanup the log is corrupted (these are the last five lines):

2017/02/24 05:30:15 [I] Completed XXX.XXX.XXX.XXX - "GET / HTTP/1.1" 500 Internal Server Error 2528 bytes in 26900us
2017/02/24 05:31:15 [I] Completed XXX.XXX.XXX.XXX - "GET / HTTP/1.1" 500 Internal Server Error 2528 bytes in 14789us
2017/02/24 05:32:15 [I] Completed XXX.XXX.XXX.XXX - "GET / HTTP/1.1" 500 Internal Server Error 2528 bytes in 20252us
2017/02/24 05:33:15 [I] Completed XXX.XXX.XXX.XXX - "GET / HTTP/1.1" 500 Internal Server Error 2528 bytes in 16111us
2017/02
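If it helps with diagnosis, a sketch for pulling the last readable lines out of a partially corrupted log (the path is a placeholder):

# Keep only printable-character runs, then take the tail; the binary junk
# from the truncated write is skipped.
strings /var/log/myapp/app.log | tail -n 20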