Member since 05-31-2016 | 23 Posts | 4 Kudos Received | 1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
2838 | 10-25-2017 01:35 PM |
10-25-2017
01:38 PM
OK, solved: it was indeed a permission problem. But this should be considered a bug nonetheless. The message doesn't say anything about the actual issue, and it also starts an automated loop of error box -> logging out -> logging in -> error box, with no way to break out of it other than closing the browser tab and waiting for the session to expire.
10-25-2017
01:35 PM
The only "strange" thing I see is in ambari-audit.log, about my user (vide):
2017-10-25T12:23:50.729+0200, User(vide), RemoteIp(192.168.150.13), Operation(Request from server), RequestType(POST), url(http://ambari-data.billy.preprod/api/v1/requests), ResultStatus(403 Forbidden), Reason(The authenticated user is not authorized to execute the action check_host.), Command(null), Cluster name(null)
I should be in the cluster admin group. I'll try again with the local admin user.
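For anyone hitting the same thing, here is a minimal sketch of how the audit log could be scanned for denied requests. It assumes every entry follows the `Name(value), Name(value)` layout of the line quoted above; the function names `parse_audit_line` and `is_denied` are made up for this example and are not part of Ambari.

```python
import re

# Assumption: each audit entry is "<timestamp>, Name(value), Name(value), ..."
# exactly like the single line quoted in this post.
FIELD = re.compile(r"(\w[\w ]*?)\((.*?)\)(?:, |$)")

SAMPLE = (
    "2017-10-25T12:23:50.729+0200, User(vide), RemoteIp(192.168.150.13), "
    "Operation(Request from server), RequestType(POST), "
    "url(http://ambari-data.billy.preprod/api/v1/requests), "
    "ResultStatus(403 Forbidden), "
    "Reason(The authenticated user is not authorized to execute the action check_host.), "
    "Command(null), Cluster name(null)"
)


def parse_audit_line(line):
    """Split one audit entry into a dict of field name -> value."""
    timestamp, _, rest = line.strip().partition(", ")
    fields = {name: value for name, value in FIELD.findall(rest)}
    fields["timestamp"] = timestamp
    return fields


def is_denied(fields):
    """True when the server answered with 403 Forbidden."""
    return fields.get("ResultStatus", "").startswith("403")


if __name__ == "__main__":
    entry = parse_audit_line(SAMPLE)
    if is_denied(entry):
        print(entry["User"], entry["url"], entry["Reason"])
```

Running the same two functions over every line of the audit log would list each request that was rejected for lack of permissions, which is the information the error box in the UI does not show.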
10-25-2017
01:29 PM
@Jay SenSharma No, I'm trying to ADD the hosts to the cluster; they are not currently present.
Ambari Metrics is installed but it hasn't run yet, because the host is still not in any Ambari cluster.
ambari-agent.log (DEBUG enabled):
INFO 2017-10-25 15:25:42,947 DataCleaner.py:39 - Data cleanup thread started
INFO 2017-10-25 15:25:42,949 DataCleaner.py:120 - Data cleanup started
INFO 2017-10-25 15:25:42,949 DataCleaner.py:122 - Data cleanup finished
INFO 2017-10-25 15:25:42,971 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2017-10-25 15:25:42,972 main.py:181 - Newloglevel=logging.DEBUG
INFO 2017-10-25 15:25:42,972 main.py:436 - Connecting to Ambari server at https://ambari-data.billy.preprod:8440 (192.168.40.120)
DEBUG 2017-10-25 15:25:42,973 NetUtil.py:110 - Trying to connect to https://ambari-data.billy.preprod:8440
INFO 2017-10-25 15:25:42,973 NetUtil.py:67 - Connecting to https://ambari-data.billy.preprod:8440/ca
DEBUG 2017-10-25 15:25:43,064 NetUtil.py:87 - GET https://ambari-data.billy.preprod:8440/ca -> 200, body:
INFO 2017-10-25 15:25:43,064 main.py:446 - Connected to Ambari server ambari-data.billy.preprod
DEBUG 2017-10-25 15:25:43,064 Controller.py:66 - Initializing Controller RPC thread.
INFO 2017-10-25 15:25:43,065 threadpool.py:58 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2017-10-25 15:25:43,065 AlertSchedulerHandler.py:280 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2017-10-25 15:25:43,065 AlertSchedulerHandler.py:175 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x1779cd0>; currently running: False
DEBUG 2017-10-25 15:25:43,066 scheduler.py:574 - Scheduler started
DEBUG 2017-10-25 15:25:43,066 scheduler.py:579 - Looking for jobs to run
DEBUG 2017-10-25 15:25:43,066 scheduler.py:599 - No jobs; waiting until a job is added
INFO 2017-10-25 15:25:45,085 hostname.py:98 - Read public hostname 'druid-co01.billy.preprod' using socket.getfqdn()
INFO 2017-10-25 15:25:45,127 Hardware.py:174 - Some mount points were ignored: /dev/shm, /run, /sys/fs/cgroup, /run/user/0
DEBUG 2017-10-25 15:25:45,296 HostCheckReportFileHandler.py:126 - Host check report at /var/lib/ambari-agent/data/hostcheck.result
DEBUG 2017-10-25 15:25:45,297 HostCheckReportFileHandler.py:177 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result
DEBUG 2017-10-25 15:25:45,297 HostCheckReportFileHandler.py:182 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2017-10-25 15:25:45,299 Controller.py:170 - Registering with druid-co01.billy.preprod (192.168.40.52) (agent='{"hardwareProfile": {"kernel": "Linux", "domain": "billy.preprod", "physicalprocessorcount": 4, "kernelrelease": "3.10.0-693.5.2.el7.x86_64", "uptime_days": "0", "memorytotal": 8002256, "swapfree": "0.00 GB", "memorysize": 8002256, "osfamily": "redhat", "swapsize": "0.00 GB", "processorcount": 4, "netmask": "255.255.255.0", "timezone": "CET", "hardwareisa": "x86_64", "memoryfree": 6255416, "operatingsystem": "centos", "kernelmajversion": "3.10", "kernelversion": "3.10.0", "macaddress": "00:1A:4A:16:01:AD", "operatingsystemrelease": "7.4.1708", "ipaddress": "192.168.40.52", "hostname": "druid-co01", "uptime_hours": "5", "fqdn": "druid-co01.billy.preprod", "id": "root", "architecture": "x86_64", "selinux": false, "mounts": [{"available": "5313004", "used": "3064340", "percent": "37%", "device": "/dev/vda1", "mountpoint": "/", "type": "xfs", "size": "8377344"}, {"available": "3978004", "used": "0", "percent": "0%", "device": "devtmpfs", "mountpoint": "/dev", "type": "devtmpfs", "size": "3978004"}], "hardwaremodel": "x86_64", "uptime_seconds": "18331", "interfaces": "eth0,lo"}, "currentPingPort": 8670, "prefix": "/var/lib/ambari-agent/data", "agentVersion": "2.5.0.3", "agentEnv": {"transparentHugePage": "", "hostHealth": {"agentTimeStampAtReporting": 1508937945297, "activeJavaProcs": [{"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server coordinator", "pid": 11784, "hadoop": true, "user": "root"}, {"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server overlord", "pid": 
11886, "hadoop": true, "user": "root"}], "liveServices": [{"status": "Healthy", "name": "ntpd or chronyd", "desc": ""}]}, "reverseLookup": true, "alternatives": [], "umask": "18", "firewallName": "iptables", "stackFoldersAndFiles": [], "existingUsers": [], "firewallRunning": true}, "timestamp": 1508937945235, "hostname": "druid-co01.billy.preprod", "responseId": -1, "publicHostname": "druid-co01.billy.preprod"}')
INFO 2017-10-25 15:25:45,300 NetUtil.py:67 - Connecting to https://ambari-data.billy.preprod:8440/connection_info
DEBUG 2017-10-25 15:25:45,385 NetUtil.py:87 - GET https://ambari-data.billy.preprod:8440/connection_info -> 200, body: {"security.server.two_way_ssl":"false"}
DEBUG 2017-10-25 15:25:45,385 security.py:52 - Server two-way SSL authentication required: False
INFO 2017-10-25 15:25:45,385 security.py:93 - SSL Connect being called.. connecting to the server
INFO 2017-10-25 15:25:45,516 security.py:60 - SSL connection established. Two-way SSL authentication is turned off on the server.
DEBUG 2017-10-25 15:25:45,606 Controller.py:177 - Registration response is {u'agentConfig': {u'agent.auto.cache.update': u'true',
u'agent.check.mounts.timeout': u'0',
u'agent.check.remote.mounts': u'false'},
u'exitstatus': 0,
u'response': u'OK',
u'responseId': 0,
u'statusCommands': []}
INFO 2017-10-25 15:25:45,606 Controller.py:196 - Registration Successful (response id = 0)
INFO 2017-10-25 15:25:45,606 AmbariConfig.py:316 - Updating config property (agent.check.remote.mounts) with value (false)
INFO 2017-10-25 15:25:45,607 AmbariConfig.py:316 - Updating config property (agent.auto.cache.update) with value (true)
INFO 2017-10-25 15:25:45,607 AmbariConfig.py:316 - Updating config property (agent.check.mounts.timeout) with value (0)
DEBUG 2017-10-25 15:25:45,607 Controller.py:205 - Updated config:<AmbariConfig.AmbariConfig instance at 0x174bef0>
DEBUG 2017-10-25 15:25:45,607 Controller.py:212 - Got status commands on registration.
DEBUG 2017-10-25 15:25:45,607 Controller.py:256 - No status commands received from ambari-data.billy.preprod
WARNING 2017-10-25 15:25:45,607 AlertSchedulerHandler.py:123 - There are no alert definition commands in the heartbeat; unable to update definitions
INFO 2017-10-25 15:25:45,607 Controller.py:512 - Registration response from ambari-data.billy.preprod was OK
INFO 2017-10-25 15:25:45,607 Controller.py:517 - Resetting ActionQueue...
INFO 2017-10-25 15:25:55,619 Controller.py:304 - Heartbeat (response id = 0) with server is running...
INFO 2017-10-25 15:25:55,619 Controller.py:311 - Building heartbeat message
DEBUG 2017-10-25 15:25:55,620 Heartbeat.py:83 - Building Heartbeat: {responseId = 0, timestamp = 1508937955620, commandsInProgress = False, componentsMapped = False,recoveryTimestamp = -1}
DEBUG 2017-10-25 15:25:55,620 Heartbeat.py:86 - Heartbeat: {'componentStatus': [],
'hostname': 'druid-co01.billy.preprod',
'nodeStatus': {'cause': 'NONE', 'status': 'HEALTHY'},
'recoveryReport': {'summary': 'DISABLED'},
'recoveryTimestamp': -1,
'reports': [],
'responseId': 0,
'timestamp': 1508937955620}
INFO 2017-10-25 15:25:55,621 Heartbeat.py:90 - Adding host info/state to heartbeat message.
DEBUG 2017-10-25 15:25:55,733 HostCheckReportFileHandler.py:126 - Host check report at /var/lib/ambari-agent/data/hostcheck.result
DEBUG 2017-10-25 15:25:55,734 HostCheckReportFileHandler.py:177 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result
DEBUG 2017-10-25 15:25:55,734 HostCheckReportFileHandler.py:182 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2017-10-25 15:25:55,770 Hardware.py:174 - Some mount points were ignored: /, /dev, /dev/shm, /run, /sys/fs/cgroup, /run/user/0
DEBUG 2017-10-25 15:25:55,771 Heartbeat.py:100 - agentEnv: {'transparentHugePage': '', 'hostHealth': {'agentTimeStampAtReporting': 1508937955734, 'activeJavaProcs': [{'command': u'/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server overlord', 'pid': 12022, 'hadoop': True, 'user': 'root'}, {'command': u'/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server coordinator', 'pid': 12023, 'hadoop': True, 'user': 'root'}], 'liveServices': [{'status': 'Healthy', 'name': 'ntpd or chronyd', 'desc': ''}]}, 'reverseLookup': True, 'alternatives': [], 'umask': '18', 'firewallName': 'iptables', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': True}
DEBUG 2017-10-25 15:25:55,771 Heartbeat.py:101 - mounts: []
INFO 2017-10-25 15:25:55,771 Controller.py:318 - Sending Heartbeat (id = 0): {"alerts": [], "nodeStatus": {"status": "HEALTHY", "cause": "NONE"}, "timestamp": 1508937955620, "hostname": "druid-co01.billy.preprod", "responseId": 0, "reports": [], "mounts": [], "recoveryTimestamp": -1, "agentEnv": {"transparentHugePage": "", "hostHealth": {"agentTimeStampAtReporting": 1508937955734, "activeJavaProcs": [{"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server overlord", "pid": 12022, "hadoop": true, "user": "root"}, {"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server coordinator", "pid": 12023, "hadoop": true, "user": "root"}], "liveServices": [{"status": "Healthy", "name": "ntpd or chronyd", "desc": ""}]}, "reverseLookup": true, "alternatives": [], "umask": "18", "firewallName": "iptables", "stackFoldersAndFiles": [], "existingUsers": [], "firewallRunning": true}, "recoveryReport": {"summary": "DISABLED"}, "componentStatus": []}
INFO 2017-10-25 15:25:55,775 Controller.py:332 - Heartbeat response received (id = 1)
INFO 2017-10-25 15:25:55,775 Controller.py:341 - Heartbeat interval is 10 seconds
INFO 2017-10-25 15:25:55,775 Controller.py:377 - Updating configurations from heartbeat
INFO 2017-10-25 15:25:55,775 Controller.py:386 - Adding cancel/execution commands
DEBUG 2017-10-25 15:25:55,775 Controller.py:246 - No commands received from ambari-data.billy.preprod
DEBUG 2017-10-25 15:25:55,775 Controller.py:256 - No status commands received from ambari-data.billy.preprod
INFO 2017-10-25 15:25:55,775 Controller.py:403 - Adding recovery commands
DEBUG 2017-10-25 15:25:55,775 Controller.py:422 - No commands sent from ambari-data.billy.preprod
INFO 2017-10-25 15:25:55,775 Controller.py:471 - Waiting 9.9 for next heartbeat
INFO 2017-10-25 15:26:05,677 Controller.py:478 - Wait for next heartbeat over
DEBUG 2017-10-25 15:26:05,677 Controller.py:304 - Heartbeat (response id = 1) with server is running...
DEBUG 2017-10-25 15:26:05,677 Controller.py:311 - Building heartbeat message
DEBUG 2017-10-25 15:26:05,678 Heartbeat.py:83 - Building Heartbeat: {responseId = 1, timestamp = 1508937965678, commandsInProgress = False, componentsMapped = False,recoveryTimestamp = -1}
DEBUG 2017-10-25 15:26:05,679 Heartbeat.py:86 - Heartbeat: {'componentStatus': [],
'hostname': 'druid-co01.billy.preprod',
'nodeStatus': {'cause': 'NONE', 'status': 'HEALTHY'},
'recoveryReport': {'summary': 'DISABLED'},
'recoveryTimestamp': -1,
'reports': [],
'responseId': 1,
'timestamp': 1508937965678}
DEBUG 2017-10-25 15:26:05,680 Controller.py:318 - Sending Heartbeat (id = 1): {"alerts": [], "nodeStatus": {"status": "HEALTHY", "cause": "NONE"}, "timestamp": 1508937965678, "hostname": "druid-co01.billy.preprod", "responseId": 1, "reports": [], "recoveryTimestamp": -1, "recoveryReport": {"summary": "DISABLED"}, "componentStatus": []}
DEBUG 2017-10-25 15:26:05,683 Controller.py:332 - Heartbeat response received (id = 2)
DEBUG 2017-10-25 15:26:05,683 Controller.py:341 - Heartbeat interval is 10 seconds
DEBUG 2017-10-25 15:26:05,683 Controller.py:377 - Updating configurations from heartbeat
DEBUG 2017-10-25 15:26:05,684 Controller.py:386 - Adding cancel/execution commands
DEBUG 2017-10-25 15:26:05,684 Controller.py:246 - No commands received from ambari-data.billy.preprod
DEBUG 2017-10-25 15:26:05,684 Controller.py:256 - No status commands received from ambari-data.billy.preprod
DEBUG 2017-10-25 15:26:05,684 Controller.py:403 - Adding recovery commands
DEBUG 2017-10-25 15:26:05,684 Controller.py:422 - No commands sent from ambari-data.billy.preprod
DEBUG 2017-10-25 15:26:05,684 Controller.py:471 - Waiting 9.9 for next heartbeat
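With DEBUG enabled the agent log gets long quickly, so a small filter helps to confirm that nothing above INFO is hiding in it. The sketch below assumes the line layout seen in the excerpt above (`LEVEL date time File.py:line - message`); `notable_lines` is a hypothetical helper name, and the three sample lines are copied from this log.

```python
import re

# Assumption: agent log lines look like
# "LEVEL YYYY-MM-DD HH:MM:SS,mmm File.py:123 - message", as in the excerpt above.
LINE = re.compile(
    r"^(DEBUG|INFO|WARNING|ERROR|CRITICAL) "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(\S+):(\d+) - (.*)$"
)

SAMPLE = [
    "INFO 2017-10-25 15:25:42,947 DataCleaner.py:39 - Data cleanup thread started",
    "WARNING 2017-10-25 15:25:43,065 AlertSchedulerHandler.py:280 - [AlertScheduler] "
    "/var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. "
    "No alerts will be scheduled until registration occurs.",
    "WARNING 2017-10-25 15:25:45,607 AlertSchedulerHandler.py:123 - There are no alert "
    "definition commands in the heartbeat; unable to update definitions",
]


def notable_lines(lines, levels=("WARNING", "ERROR", "CRITICAL")):
    """Yield (level, timestamp, source file, line number, message) for the given levels."""
    for line in lines:
        match = LINE.match(line.rstrip("\n"))
        if match and match.group(1) in levels:
            level, stamp, source, lineno, message = match.groups()
            yield level, stamp, source, int(lineno), message


if __name__ == "__main__":
    for level, stamp, source, lineno, message in notable_lines(SAMPLE):
        print(level, stamp, f"{source}:{lineno}", message)
```

To run it against a real file, pass an open file object instead of `SAMPLE`. Continuation lines of multi-line entries (such as the pretty-printed heartbeat dicts) do not start with a level and are simply skipped.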
And the last part of the server log:
25 Oct 2017 15:25:43,107 WARN [qtp-ambari-agent-39] SecurityFilter:103 - Request https://ambari-data.billy.preprod:8440/ca doesn't match any pattern.
25 Oct 2017 15:25:43,107 WARN [qtp-ambari-agent-39] SecurityFilter:62 - This request is not allowed on this port: https://ambari-data.billy.preprod:8440/ca
25 Oct 2017 15:25:45,570 INFO [qtp-ambari-agent-36] HeartBeatHandler:425 - agentOsType = centos7
25 Oct 2017 15:25:45,647 INFO [qtp-ambari-agent-36] HostImpl:329 - Received host registration, host=[hostname=druid-co01,fqdn=druid-co01.billy.preprod,domain=billy.preprod,architecture=x86_64,processorcount=4,physicalprocessorcount=4,osname=centos,osversion=7.4.1708,osfamily=redhat,memory=8002256,uptime_hours=5,mounts=(available=5313004,mountpoint=/,used=3064340,percent=37%,size=8377344,device=/dev/vda1,type=xfs)(available=3978004,mountpoint=/dev,used=0,percent=0%,size=3978004,device=devtmpfs,type=devtmpfs)]
, registrationTime=1508937945570, agentVersion=2.5.0.3
25 Oct 2017 15:25:45,647 INFO [qtp-ambari-agent-36] TopologyManager:548 - TopologyManager.onHostRegistered: Entering
25 Oct 2017 15:25:45,647 INFO [qtp-ambari-agent-36] TopologyManager:602 - Host druid-co01.billy.preprod re-registered, will not be added to the available hosts list
10-25-2017
10:42 AM
Hello, I'm using Ambari Server 2.5.0.3-7 and Agent 2.5.0.3, and I get this error when manually adding a new host to an Ambari cluster. The agent was preinstalled and preconfigured, and the new host is running CentOS 7.4. There is no useful information (warnings or errors) in either the agent or the server logs.
Labels: Apache Ambari
07-11-2017
01:23 PM
Hello @Michael Dennis "MD" Uanang, can you confirm that this worked on a 3-node ZooKeeper install? I need to move my ZooKeeper cluster from the original 3 hosts to 3 other hosts. Will it work if I repeat the same procedure 3 times?
03-08-2017
11:16 AM
Thank you very much @Roland Simonis
I had the same problem (I reinstalled a slave from scratch and decommissioned both the DN and the NM via Ambari), but Ambari doesn't have a GUI option for recommissioning the NM, so the ResourceManager kept denying access. With the API call you posted I easily recommissioned the NM, and now everything is working as expected again. Thanks!
03-01-2017
04:24 PM
Hello, how can I install HDP 2.5.3 with the latest CloudBreak release, 1.6.3? I'm using an Ambari blueprint with HDP: 2.5, but it installs 2.5.0 and not the latest HDP revision.
02-16-2017
05:03 PM
This is a Hive Streaming installation upgraded directly from HDP 2.3.6 to HDP 2.5.3. This table partition was created on 2.5.3.
02-16-2017
05:02 PM
No, @Eugene Koifman. Just the "_orc_acid_version" file and all the delta subdirs with the 8 bucket files + 8 _flush_length files. The _tmp file never gets created, and the compaction job actually fails in a matter of seconds.
02-16-2017
04:35 PM
@Eugene Koifman @Wei Zheng
This seems related to https://issues.apache.org/jira/browse/HIVE-15142. Do you have any idea about our problem?