Member since: 05-31-2016
Posts: 23
Kudos Received: 4
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1508 | 10-25-2017 01:35 PM |
10-25-2017
01:38 PM
OK, solved: it was indeed a permission problem. But this should still be considered a bug. The message says nothing about the actual issue, and it triggers an automated loop of error box -> log out -> log in -> error box, with no way to break out of it other than closing the browser tab and waiting for the session to expire.
10-25-2017
01:35 PM
The only "strange" thing I see is in the ambari-audit.log, about my user (vide): 2017-10-25T12:23:50.729+0200, User(vide), RemoteIp(192.168.150.13), Operation(Request from server), RequestType(POST), url(http://ambari-data.billy.preprod/api/v1/requests), ResultStatus(403 Forbidden), Reason(The authenticated user is not authorized to execute the action check_host.), Command(null), Cluster name(null)
I should be in the cluster admin group... I'll try again with the local admin user.
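One way to double-check what Ambari has actually granted is its REST API. A sketch, with my server name and user filled in as placeholders (the exact response fields may vary by Ambari version):

```
# Privileges Ambari has recorded for the user "vide"
curl -u admin:admin -H 'X-Requested-By: ambari' \
  'http://ambari-data.billy.preprod:8080/api/v1/users/vide/privileges?fields=*'

# Privileges granted on the cluster itself (CLUSTER_NAME is a placeholder)
curl -u admin:admin -H 'X-Requested-By: ambari' \
  'http://ambari-data.billy.preprod:8080/api/v1/clusters/CLUSTER_NAME/privileges?fields=*'
```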
10-25-2017
01:29 PM
@Jay SenSharma No, I'm trying to ADD the hosts to the cluster; they are currently not present. Screenshot: Ambari-metrics is installed, but it hasn't run yet because the host is still not in any Ambari cluster. ambari-agent.log (DEBUG enabled): INFO 2017-10-25 15:25:42,947 DataCleaner.py:39 - Data cleanup thread started
INFO 2017-10-25 15:25:42,949 DataCleaner.py:120 - Data cleanup started
INFO 2017-10-25 15:25:42,949 DataCleaner.py:122 - Data cleanup finished
INFO 2017-10-25 15:25:42,971 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2017-10-25 15:25:42,972 main.py:181 - Newloglevel=logging.DEBUG
INFO 2017-10-25 15:25:42,972 main.py:436 - Connecting to Ambari server at https://ambari-data.billy.preprod:8440 (192.168.40.120)
DEBUG 2017-10-25 15:25:42,973 NetUtil.py:110 - Trying to connect to https://ambari-data.billy.preprod:8440
INFO 2017-10-25 15:25:42,973 NetUtil.py:67 - Connecting to https://ambari-data.billy.preprod:8440/ca
DEBUG 2017-10-25 15:25:43,064 NetUtil.py:87 - GET https://ambari-data.billy.preprod:8440/ca -> 200, body:
INFO 2017-10-25 15:25:43,064 main.py:446 - Connected to Ambari server ambari-data.billy.preprod
DEBUG 2017-10-25 15:25:43,064 Controller.py:66 - Initializing Controller RPC thread.
INFO 2017-10-25 15:25:43,065 threadpool.py:58 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2017-10-25 15:25:43,065 AlertSchedulerHandler.py:280 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2017-10-25 15:25:43,065 AlertSchedulerHandler.py:175 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x1779cd0>; currently running: False
DEBUG 2017-10-25 15:25:43,066 scheduler.py:574 - Scheduler started
DEBUG 2017-10-25 15:25:43,066 scheduler.py:579 - Looking for jobs to run
DEBUG 2017-10-25 15:25:43,066 scheduler.py:599 - No jobs; waiting until a job is added
INFO 2017-10-25 15:25:45,085 hostname.py:98 - Read public hostname 'druid-co01.billy.preprod' using socket.getfqdn()
INFO 2017-10-25 15:25:45,127 Hardware.py:174 - Some mount points were ignored: /dev/shm, /run, /sys/fs/cgroup, /run/user/0
DEBUG 2017-10-25 15:25:45,296 HostCheckReportFileHandler.py:126 - Host check report at /var/lib/ambari-agent/data/hostcheck.result
DEBUG 2017-10-25 15:25:45,297 HostCheckReportFileHandler.py:177 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result
DEBUG 2017-10-25 15:25:45,297 HostCheckReportFileHandler.py:182 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2017-10-25 15:25:45,299 Controller.py:170 - Registering with druid-co01.billy.preprod (192.168.40.52) (agent='{"hardwareProfile": {"kernel": "Linux", "domain": "billy.preprod", "physicalprocessorcount": 4, "kernelrelease": "3.10.0-693.5.2.el7.x86_64", "uptime_days": "0", "memorytotal": 8002256, "swapfree": "0.00 GB", "memorysize": 8002256, "osfamily": "redhat", "swapsize": "0.00 GB", "processorcount": 4, "netmask": "255.255.255.0", "timezone": "CET", "hardwareisa": "x86_64", "memoryfree": 6255416, "operatingsystem": "centos", "kernelmajversion": "3.10", "kernelversion": "3.10.0", "macaddress": "00:1A:4A:16:01:AD", "operatingsystemrelease": "7.4.1708", "ipaddress": "192.168.40.52", "hostname": "druid-co01", "uptime_hours": "5", "fqdn": "druid-co01.billy.preprod", "id": "root", "architecture": "x86_64", "selinux": false, "mounts": [{"available": "5313004", "used": "3064340", "percent": "37%", "device": "/dev/vda1", "mountpoint": "/", "type": "xfs", "size": "8377344"}, {"available": "3978004", "used": "0", "percent": "0%", "device": "devtmpfs", "mountpoint": "/dev", "type": "devtmpfs", "size": "3978004"}], "hardwaremodel": "x86_64", "uptime_seconds": "18331", "interfaces": "eth0,lo"}, "currentPingPort": 8670, "prefix": "/var/lib/ambari-agent/data", "agentVersion": "2.5.0.3", "agentEnv": {"transparentHugePage": "", "hostHealth": {"agentTimeStampAtReporting": 1508937945297, "activeJavaProcs": [{"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server coordinator", "pid": 11784, "hadoop": true, "user": "root"}, {"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server overlord", "pid": 
11886, "hadoop": true, "user": "root"}], "liveServices": [{"status": "Healthy", "name": "ntpd or chronyd", "desc": ""}]}, "reverseLookup": true, "alternatives": [], "umask": "18", "firewallName": "iptables", "stackFoldersAndFiles": [], "existingUsers": [], "firewallRunning": true}, "timestamp": 1508937945235, "hostname": "druid-co01.billy.preprod", "responseId": -1, "publicHostname": "druid-co01.billy.preprod"}')
INFO 2017-10-25 15:25:45,300 NetUtil.py:67 - Connecting to https://ambari-data.billy.preprod:8440/connection_info
DEBUG 2017-10-25 15:25:45,385 NetUtil.py:87 - GET https://ambari-data.billy.preprod:8440/connection_info -> 200, body: {"security.server.two_way_ssl":"false"}
DEBUG 2017-10-25 15:25:45,385 security.py:52 - Server two-way SSL authentication required: False
INFO 2017-10-25 15:25:45,385 security.py:93 - SSL Connect being called.. connecting to the server
INFO 2017-10-25 15:25:45,516 security.py:60 - SSL connection established. Two-way SSL authentication is turned off on the server.
DEBUG 2017-10-25 15:25:45,606 Controller.py:177 - Registration response is {u'agentConfig': {u'agent.auto.cache.update': u'true',
u'agent.check.mounts.timeout': u'0',
u'agent.check.remote.mounts': u'false'},
u'exitstatus': 0,
u'response': u'OK',
u'responseId': 0,
u'statusCommands': []}
INFO 2017-10-25 15:25:45,606 Controller.py:196 - Registration Successful (response id = 0)
INFO 2017-10-25 15:25:45,606 AmbariConfig.py:316 - Updating config property (agent.check.remote.mounts) with value (false)
INFO 2017-10-25 15:25:45,607 AmbariConfig.py:316 - Updating config property (agent.auto.cache.update) with value (true)
INFO 2017-10-25 15:25:45,607 AmbariConfig.py:316 - Updating config property (agent.check.mounts.timeout) with value (0)
DEBUG 2017-10-25 15:25:45,607 Controller.py:205 - Updated config:<AmbariConfig.AmbariConfig instance at 0x174bef0>
DEBUG 2017-10-25 15:25:45,607 Controller.py:212 - Got status commands on registration.
DEBUG 2017-10-25 15:25:45,607 Controller.py:256 - No status commands received from ambari-data.billy.preprod
WARNING 2017-10-25 15:25:45,607 AlertSchedulerHandler.py:123 - There are no alert definition commands in the heartbeat; unable to update definitions
INFO 2017-10-25 15:25:45,607 Controller.py:512 - Registration response from ambari-data.billy.preprod was OK
INFO 2017-10-25 15:25:45,607 Controller.py:517 - Resetting ActionQueue...
INFO 2017-10-25 15:25:55,619 Controller.py:304 - Heartbeat (response id = 0) with server is running...
INFO 2017-10-25 15:25:55,619 Controller.py:311 - Building heartbeat message
DEBUG 2017-10-25 15:25:55,620 Heartbeat.py:83 - Building Heartbeat: {responseId = 0, timestamp = 1508937955620, commandsInProgress = False, componentsMapped = False,recoveryTimestamp = -1}
DEBUG 2017-10-25 15:25:55,620 Heartbeat.py:86 - Heartbeat: {'componentStatus': [],
'hostname': 'druid-co01.billy.preprod',
'nodeStatus': {'cause': 'NONE', 'status': 'HEALTHY'},
'recoveryReport': {'summary': 'DISABLED'},
'recoveryTimestamp': -1,
'reports': [],
'responseId': 0,
'timestamp': 1508937955620}
INFO 2017-10-25 15:25:55,621 Heartbeat.py:90 - Adding host info/state to heartbeat message.
DEBUG 2017-10-25 15:25:55,733 HostCheckReportFileHandler.py:126 - Host check report at /var/lib/ambari-agent/data/hostcheck.result
DEBUG 2017-10-25 15:25:55,734 HostCheckReportFileHandler.py:177 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result
DEBUG 2017-10-25 15:25:55,734 HostCheckReportFileHandler.py:182 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2017-10-25 15:25:55,770 Hardware.py:174 - Some mount points were ignored: /, /dev, /dev/shm, /run, /sys/fs/cgroup, /run/user/0
DEBUG 2017-10-25 15:25:55,771 Heartbeat.py:100 - agentEnv: {'transparentHugePage': '', 'hostHealth': {'agentTimeStampAtReporting': 1508937955734, 'activeJavaProcs': [{'command': u'/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server overlord', 'pid': 12022, 'hadoop': True, 'user': 'root'}, {'command': u'/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server coordinator', 'pid': 12023, 'hadoop': True, 'user': 'root'}], 'liveServices': [{'status': 'Healthy', 'name': 'ntpd or chronyd', 'desc': ''}]}, 'reverseLookup': True, 'alternatives': [], 'umask': '18', 'firewallName': 'iptables', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': True}
DEBUG 2017-10-25 15:25:55,771 Heartbeat.py:101 - mounts: []
INFO 2017-10-25 15:25:55,771 Controller.py:318 - Sending Heartbeat (id = 0): {"alerts": [], "nodeStatus": {"status": "HEALTHY", "cause": "NONE"}, "timestamp": 1508937955620, "hostname": "druid-co01.billy.preprod", "responseId": 0, "reports": [], "mounts": [], "recoveryTimestamp": -1, "agentEnv": {"transparentHugePage": "", "hostHealth": {"agentTimeStampAtReporting": 1508937955734, "activeJavaProcs": [{"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server overlord", "pid": 12022, "hadoop": true, "user": "root"}, {"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server coordinator", "pid": 12023, "hadoop": true, "user": "root"}], "liveServices": [{"status": "Healthy", "name": "ntpd or chronyd", "desc": ""}]}, "reverseLookup": true, "alternatives": [], "umask": "18", "firewallName": "iptables", "stackFoldersAndFiles": [], "existingUsers": [], "firewallRunning": true}, "recoveryReport": {"summary": "DISABLED"}, "componentStatus": []}
INFO 2017-10-25 15:25:55,775 Controller.py:332 - Heartbeat response received (id = 1)
INFO 2017-10-25 15:25:55,775 Controller.py:341 - Heartbeat interval is 10 seconds
INFO 2017-10-25 15:25:55,775 Controller.py:377 - Updating configurations from heartbeat
INFO 2017-10-25 15:25:55,775 Controller.py:386 - Adding cancel/execution commands
DEBUG 2017-10-25 15:25:55,775 Controller.py:246 - No commands received from ambari-data.billy.preprod
DEBUG 2017-10-25 15:25:55,775 Controller.py:256 - No status commands received from ambari-data.billy.preprod
INFO 2017-10-25 15:25:55,775 Controller.py:403 - Adding recovery commands
DEBUG 2017-10-25 15:25:55,775 Controller.py:422 - No commands sent from ambari-data.billy.preprod
INFO 2017-10-25 15:25:55,775 Controller.py:471 - Waiting 9.9 for next heartbeat
INFO 2017-10-25 15:26:05,677 Controller.py:478 - Wait for next heartbeat over
DEBUG 2017-10-25 15:26:05,677 Controller.py:304 - Heartbeat (response id = 1) with server is running...
DEBUG 2017-10-25 15:26:05,677 Controller.py:311 - Building heartbeat message
DEBUG 2017-10-25 15:26:05,678 Heartbeat.py:83 - Building Heartbeat: {responseId = 1, timestamp = 1508937965678, commandsInProgress = False, componentsMapped = False,recoveryTimestamp = -1}
DEBUG 2017-10-25 15:26:05,679 Heartbeat.py:86 - Heartbeat: {'componentStatus': [],
'hostname': 'druid-co01.billy.preprod',
'nodeStatus': {'cause': 'NONE', 'status': 'HEALTHY'},
'recoveryReport': {'summary': 'DISABLED'},
'recoveryTimestamp': -1,
'reports': [],
'responseId': 1,
'timestamp': 1508937965678}
DEBUG 2017-10-25 15:26:05,680 Controller.py:318 - Sending Heartbeat (id = 1): {"alerts": [], "nodeStatus": {"status": "HEALTHY", "cause": "NONE"}, "timestamp": 1508937965678, "hostname": "druid-co01.billy.preprod", "responseId": 1, "reports": [], "recoveryTimestamp": -1, "recoveryReport": {"summary": "DISABLED"}, "componentStatus": []}
DEBUG 2017-10-25 15:26:05,683 Controller.py:332 - Heartbeat response received (id = 2)
DEBUG 2017-10-25 15:26:05,683 Controller.py:341 - Heartbeat interval is 10 seconds
DEBUG 2017-10-25 15:26:05,683 Controller.py:377 - Updating configurations from heartbeat
DEBUG 2017-10-25 15:26:05,684 Controller.py:386 - Adding cancel/execution commands
DEBUG 2017-10-25 15:26:05,684 Controller.py:246 - No commands received from ambari-data.billy.preprod
DEBUG 2017-10-25 15:26:05,684 Controller.py:256 - No status commands received from ambari-data.billy.preprod
DEBUG 2017-10-25 15:26:05,684 Controller.py:403 - Adding recovery commands
DEBUG 2017-10-25 15:26:05,684 Controller.py:422 - No commands sent from ambari-data.billy.preprod
DEBUG 2017-10-25 15:26:05,684 Controller.py:471 - Waiting 9.9 for next heartbeat
And the last part of the server log: 25 Oct 2017 15:25:43,107 WARN [qtp-ambari-agent-39] SecurityFilter:103 - Request https://ambari-data.billy.preprod:8440/ca doesn't match any pattern.
25 Oct 2017 15:25:43,107 WARN [qtp-ambari-agent-39] SecurityFilter:62 - This request is not allowed on this port: https://ambari-data.billy.preprod:8440/ca
25 Oct 2017 15:25:45,570 INFO [qtp-ambari-agent-36] HeartBeatHandler:425 - agentOsType = centos7
25 Oct 2017 15:25:45,647 INFO [qtp-ambari-agent-36] HostImpl:329 - Received host registration, host=[hostname=druid-co01,fqdn=druid-co01.billy.preprod,domain=billy.preprod,architecture=x86_64,processorcount=4,physicalprocessorcount=4,osname=centos,osversion=7.4.1708,osfamily=redhat,memory=8002256,uptime_hours=5,mounts=(available=5313004,mountpoint=/,used=3064340,percent=37%,size=8377344,device=/dev/vda1,type=xfs)(available=3978004,mountpoint=/dev,used=0,percent=0%,size=3978004,device=devtmpfs,type=devtmpfs)]
, registrationTime=1508937945570, agentVersion=2.5.0.3
25 Oct 2017 15:25:45,647 INFO [qtp-ambari-agent-36] TopologyManager:548 - TopologyManager.onHostRegistered: Entering
25 Oct 2017 15:25:45,647 INFO [qtp-ambari-agent-36] TopologyManager:602 - Host druid-co01.billy.preprod re-registered, will not be added to the available hosts list
10-25-2017
10:42 AM
Hello, I'm using Ambari Server 2.5.0.3-7 and Agent 2.5.0.3, and I get this error when manually adding a new host to an Ambari cluster. The agent was preinstalled and preconfigured, and the new host is running CentOS 7.4. There is no useful info (warning, error) in either the agent or the server logs.
Labels: Apache Ambari
07-11-2017
01:23 PM
Hello @Michael Dennis "MD" Uanang, can you confirm that this worked on a 3-node ZooKeeper install? I need to move my ZooKeeper cluster from the original 3 hosts to 3 other hosts; will this work if I repeat the same procedure 3 times?
03-08-2017
11:16 AM
Thank you very much @Roland Simonis
I had the same problem (I reinstalled a slave from scratch and decommissioned both the DataNode and the NodeManager via Ambari), but Ambari doesn't have a GUI option for recommissioning the NodeManager, so the ResourceManager kept denying access. With the API call you posted I easily recommissioned the NodeManager, and now everything is working again as expected. Thanks!
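For anyone landing here later: the call in question is Ambari's DECOMMISSION custom command with an included_hosts list, which effectively recommissions the node. A sketch of the shape I used, with the cluster name, host, and server as placeholders; double-check the fields against your Ambari version:

```
curl -u admin:admin -H 'X-Requested-By: ambari' -X POST \
  -d '{
    "RequestInfo": {
      "context": "Recommission NodeManager",
      "command": "DECOMMISSION",
      "parameters": {"slave_type": "NODEMANAGER", "included_hosts": "worker01.example.com"},
      "operation_level": {"level": "HOST_COMPONENT", "cluster_name": "MYCLUSTER"}
    },
    "Requests/resource_filters": [
      {"service_name": "YARN", "component_name": "RESOURCEMANAGER"}
    ]
  }' \
  http://ambari-server:8080/api/v1/clusters/MYCLUSTER/requests
```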
03-01-2017
04:24 PM
Hello, how can I install HDP 2.5.3 with the latest Cloudbreak release, 1.6.3? I'm using an Ambari blueprint with HDP: 2.5, but it installs 2.5.0 and not the latest HDP revision.
02-16-2017
05:03 PM
This is a Hive Streaming installation, updated directly from HDP 2.3.6 to HDP 2.5.3. This table partition was created on 2.5.3.
02-16-2017
05:02 PM
No, @Eugene Koifman, just the "_orc_acid_version" file and all the delta subdirs with the 8 bucket files plus 8 _flush_length files. The _tmp file never gets created, and the compaction job actually fails in a matter of seconds.
02-16-2017
04:35 PM
@Eugene Koifman @Wei Zheng
this seems related to https://issues.apache.org/jira/browse/HIVE-15142. Do you have any idea about our problem?
02-16-2017
04:23 PM
Thanks! It worked! I put the setting in Ambari under custom-hive-site, restarted the affected services, and there's no more noise.
02-16-2017
01:04 PM
Hello, I just upgraded to HDP 2.5.3 from HDP 2.3.6 and I'm experiencing lots of problems. One, for instance, is that the 3 Hive metastores I'm running for HA's sake are printing this to metastore.log like crazy, every few milliseconds: 2017-02-16 13:58:01,341 INFO [org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService-0]: txn.TxnHandler (TxnHandler.java:performTimeOuts(2960)) - Aborted 0 transactions due to timeout
2017-02-16 13:58:01,345 INFO [org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService-0]: txn.TxnHandler (TxnHandler.java:performTimeOuts(2949)) - Aborted the following transactions due to timeout: []
2017-02-16 13:58:01,345 INFO [org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService-0]: txn.TxnHandler (TxnHandler.java:performTimeOuts(2960)) - Aborted 0 transactions due to timeout
2017-02-16 13:58:01,350 INFO [org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService-0]: txn.TxnHandler (TxnHandler.java:performTimeOuts(2949)) - Aborted the following transactions due to timeout: []
2017-02-16 13:58:01,350 INFO [org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService-0]: txn.TxnHandler (TxnHandler.java:performTimeOuts(2960)) - Aborted 0 transactions due to timeout
2017-02-16 13:58:01,354 INFO [org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService-0]: txn.TxnHandler (TxnHandler.java:performTimeOuts(2949)) - Aborted the following transactions due to timeout: []
2017-02-16 13:58:01,354 INFO [org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService-0]: txn.TxnHandler (TxnHandler.java:performTimeOuts(2960)) - Aborted 0 transactions due to timeout
2017-02-16 13:58:01,359 INFO [org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService-0]: txn.TxnHandler (TxnHandler.java:performTimeOuts(2949)) - Aborted the following transactions due to timeout: []
I cannot find anything pointing to the cause of this problem. Any hint?
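As a stopgap while investigating, I'm considering raising the log level for the transaction handler so the metastore log stays readable. A sketch for the log4j 1.x config shipped with HDP (the exact logger package may differ between Hive versions, so check the full category name in your own log before using it):

```
# hive-log4j.properties (stopgap only; this hides the messages, it doesn't fix the cause)
log4j.logger.org.apache.hadoop.hive.metastore.txn.TxnHandler=WARN
```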
02-16-2017
10:56 AM
@Arif Hossain did you manage to fix the compaction problem? How? I have the same problem on a new partition after upgrading to HDP 2.5.3 from 2.3.6, and it's not a permission problem as in @Benjamin Hopp's case.
01-19-2017
09:19 AM
Hello @Santhosh B Gowda, we fixed it by deleting the whole /storm path in ZooKeeper plus /var/hadoop/storm on the Nimbus hosts, and then deploying the topologies again. The only drawback is that we had to stop all the topologies for a few minutes, causing a minor downtime. Thanks for the help.
01-18-2017
09:02 AM
1 Kudo
Hello, yesterday we upgraded our Ambari installation from 2.2.2.0 to 2.4.2.0. Ambari is managing an HDP 2.3.6 cluster. After the upgrade (following all of these instructions: http://docs.hortonworks.com/HDPDocuments/Ambari-2.4.2.0/bk_ambari-upgrade/content/upgrade_ambari.html), Storm Nimbus crashes on start with this exception: 2017-01-17 19:39:29.871 b.s.zookeeper [INFO] Accepting leadership, all active topology found localy.
2017-01-17 19:39:29.928 b.s.d.nimbus [INFO] Starting Nimbus server...
2017-01-17 19:39:30.860 b.s.d.nimbus [ERROR] Error when processing event
java.lang.NullPointerException
at clojure.lang.Numbers.ops(Numbers.java:961) ~[clojure-1.6.0.jar:?]
at clojure.lang.Numbers.isZero(Numbers.java:90) ~[clojure-1.6.0.jar:?]
at backtype.storm.util$partition_fixed.invoke(util.clj:900) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at clojure.lang.AFn.applyToHelper(AFn.java:156) ~[clojure-1.6.0.jar:?]
at clojure.lang.AFn.applyTo(AFn.java:144) ~[clojure-1.6.0.jar:?]
at clojure.core$apply.invoke(core.clj:624) ~[clojure-1.6.0.jar:?]
at clojure.lang.AFn.applyToHelper(AFn.java:156) ~[clojure-1.6.0.jar:?]
at clojure.lang.RestFn.applyTo(RestFn.java:132) ~[clojure-1.6.0.jar:?]
at clojure.core$apply.invoke(core.clj:626) ~[clojure-1.6.0.jar:?]
at clojure.core$partial$fn__4228.doInvoke(core.clj:2468) ~[clojure-1.6.0.jar:?]
at clojure.lang.RestFn.invoke(RestFn.java:408) ~[clojure-1.6.0.jar:?]
at backtype.storm.util$map_val$iter__1807__1811$fn__1812.invoke(util.clj:305) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at clojure.lang.LazySeq.sval(LazySeq.java:40) ~[clojure-1.6.0.jar:?]
at clojure.lang.LazySeq.seq(LazySeq.java:49) ~[clojure-1.6.0.jar:?]
at clojure.lang.Cons.next(Cons.java:39) ~[clojure-1.6.0.jar:?]
at clojure.lang.RT.next(RT.java:598) ~[clojure-1.6.0.jar:?]
at clojure.core$next.invoke(core.clj:64) ~[clojure-1.6.0.jar:?]
at clojure.core.protocols$fn__6086.invoke(protocols.clj:146) ~[clojure-1.6.0.jar:?]
at clojure.core.protocols$fn__6057$G__6052__6066.invoke(protocols.clj:19) ~[clojure-1.6.0.jar:?]
at clojure.core.protocols$seq_reduce.invoke(protocols.clj:31) ~[clojure-1.6.0.jar:?]
at clojure.core.protocols$fn__6078.invoke(protocols.clj:54) ~[clojure-1.6.0.jar:?]
at clojure.core.protocols$fn__6031$G__6026__6044.invoke(protocols.clj:13) ~[clojure-1.6.0.jar:?]
at clojure.core$reduce.invoke(core.clj:6289) ~[clojure-1.6.0.jar:?]
at clojure.core$into.invoke(core.clj:6341) ~[clojure-1.6.0.jar:?]
at backtype.storm.util$map_val.invoke(util.clj:304) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.daemon.nimbus$compute_executors.invoke(nimbus.clj:491) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.daemon.nimbus$compute_executor__GT_component.invoke(nimbus.clj:502) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.daemon.nimbus$read_topology_details.invoke(nimbus.clj:394) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.daemon.nimbus$mk_assignments$iter__7809__7813$fn__7814.invoke(nimbus.clj:722) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at clojure.lang.LazySeq.sval(LazySeq.java:40) ~[clojure-1.6.0.jar:?]
at clojure.lang.LazySeq.seq(LazySeq.java:49) ~[clojure-1.6.0.jar:?]
at clojure.lang.RT.seq(RT.java:484) ~[clojure-1.6.0.jar:?]
at clojure.core$seq.invoke(core.clj:133) ~[clojure-1.6.0.jar:?]
at clojure.core.protocols$seq_reduce.invoke(protocols.clj:30) ~[clojure-1.6.0.jar:?]
at clojure.core.protocols$fn__6078.invoke(protocols.clj:54) ~[clojure-1.6.0.jar:?]
at clojure.core.protocols$fn__6031$G__6026__6044.invoke(protocols.clj:13) ~[clojure-1.6.0.jar:?]
at clojure.core$reduce.invoke(core.clj:6289) ~[clojure-1.6.0.jar:?]
at clojure.core$into.invoke(core.clj:6341) ~[clojure-1.6.0.jar:?]
at backtype.storm.daemon.nimbus$mk_assignments.doInvoke(nimbus.clj:721) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at clojure.lang.RestFn.invoke(RestFn.java:410) ~[clojure-1.6.0.jar:?]
at backtype.storm.daemon.nimbus$fn__8060$exec_fn__3866__auto____8061$fn__8068$fn__8069.invoke(nimbus.clj:1112) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.daemon.nimbus$fn__8060$exec_fn__3866__auto____8061$fn__8068.invoke(nimbus.clj:1111) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.timer$schedule_recurring$this__2489.invoke(timer.clj:102) ~[storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.timer$mk_timer$fn__2472$fn__2473.invoke(timer.clj:50) [storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.timer$mk_timer$fn__2472.invoke(timer.clj:42) [storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_40]
2017-01-17 19:39:30.873 b.s.util [ERROR] Halting process: ("Error when processing an event")
java.lang.RuntimeException: ("Error when processing an event")
at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:336) [storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.6.0.jar:?]
at backtype.storm.daemon.nimbus$nimbus_data$fn__7411.invoke(nimbus.clj:118) [storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.timer$mk_timer$fn__2472$fn__2473.invoke(timer.clj:71) [storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at backtype.storm.timer$mk_timer$fn__2472.invoke(timer.clj:42) [storm-core-0.10.0.2.3.6.0-3796.jar:0.10.0.2.3.6.0-3796]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_40]
2017-01-17 19:39:30.876 b.s.d.nimbus [INFO] Shutting down master
Why is this happening? The Storm version is 0.10.0.2.3. Any hint on how to debug this issue more thoroughly? Right now we cannot deploy new topologies.
Labels: Apache Ambari, Apache Storm
08-16-2016
09:44 AM
Thanks for your answer @Constantin Stanca, even if I'm sorry to hear it 😞 Well, we'll update to the latest 2.3 and wait eagerly for 2.5. Do you have a rough estimate of when 2.5 will be released as stable?
08-11-2016
08:06 AM
2 Kudos
Hello, we are currently running HDP-2.3.4.0-3485 with Hive ACID and, after upgrading our staging env to HDP-2.4.2.0-258, we cannot ALTER tables with partitions anymore. Here is an example of a query and the error: ALTER TABLE clicks_new CHANGE COLUMN billing billing STRUCT<
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Changing from type struct<cost:float,originalCost:float,currency:string,currencyExchangeRate:float,bid:struct<id:string,amount:float,originalAmount:float,currency:string,currencyExchangeRate:float>,revenue:float,revenueCurrency:string,publisherRevenue:float> to struct<cost:float,originalCost:float,currency:string,currencyExchangeRate:float,bid:struct<id:string,amount:float,originalAmount:float,currency:string,currencyExchangeRate:float>,revenue:float,revenueCurrency:string,publisherRevenue:float,revenueCurrencyExchangeRate:float> is not supported for column billing. SerDe may be incompatible (state=08S01,code=1)
This used to work with HDP-2.3, and I know it's something disabled on purpose that will come back with HDP-2.5 and Hive 2.0, but my question is: is there some setting to re-enable the old feature (with all its limits), as in HDP-2.3?
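For context on what I mean by re-enabling: the check is controlled by a metastore flag in Hive, and what I'm asking is whether flipping it is still honored by the Hive build in HDP-2.4. A sketch (behavior not guaranteed; that's exactly the question):

```
-- Restores the old, permissive column-type-change behavior where supported
SET hive.metastore.disallow.incompatible.col.type.changes=false;
```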
06-01-2016
09:41 AM
After stopping the Hive MySQL server from the Ambari UI and issuing curl -u admin:admin -X DELETE -H 'X-Requested-By:admin' http://server:8080/api/v1/clusters/$NAME/hosts/$FQDN/host_components/MYSQL_SERVER
I've successfully removed the Hive MySQL server from Ambari management, and I guess future Hive restarts will no longer touch MySQL. Thank you @Alejandro Fernandez and all the others too.
06-01-2016
07:25 AM
Thank you for your answers. To clarify: I don't want to touch Ambari's DB (where Ambari stores its configs); I want to change the DB where Hive stores its metadata. Actually, that's what I did: as @jeff and @emaxwell said, I changed the "Existing MySQL" option in the Hive configuration and pointed it to my own DB, but since Ambari was already managing the mysql server, it restarted the mysqld daemon. So I guess @Alejandro Fernandez's answer is right, with the `curl -X DELETE` operation. To sum it up: we were using the Hive MySQL DB instance installed by default by Ambari, alongside other HDP services on one of our masters. I wanted to make that MySQL installation highly available, so I installed another mysql on another master, made it a slave of the original instance, and put a virtual, floating IP in front of the MySQL service. Then I changed the Hive MySQL address in the Hive configuration to use the new VIP (which at that moment pointed to the original mysql instance) and applied the new Hive config. That's when Ambari decided to restart my original MySQL instance (and the VIP consequently moved to the MySQL slave). Hope it's clearer now 🙂
05-31-2016
07:37 PM
Hello, we have an Ambari 2.3 installation with Hive using a local mysql installation as its database. Now, we have implemented an HA solution for MySQL: MasterHA for the master, which is a bunch of scripts plus a daemon that monitors whether mysql is alive and moves its floating IP to another slave (a slave promotion) in case of master failure. While making the changes (when I changed the MySQL IP in Ambari), Ambari restarted the mysqld instance, triggering the master failover, which by the way worked well 🙂 So my question is: to avoid interference between Ambari and MasterHA, how can I tell a running Ambari installation that it shouldn't manage the mysql server? Thanks!
Labels: Apache Ambari, Apache Hive