About davide1

davide1 · ‎10-25-2017

Ok, solved, it was indeed a permission problem. But this should be considered a bug nonetheless. The message doesn't tell anything about the issue PLUS it starts an automated loop of error box -> logging out -> logging in -> error box with no options to jump out of it but waiting for the session to expire with the browser tab closed

davide1 · ‎10-25-2017

The only "strange" thing I see is in the ambari-audit.log, about my user (vide): 2017-10-25T12:23:50.729+0200, User(vide), RemoteIp(192.168.150.13), Operation(Request from server), RequestType(POST), url(http://ambari-data.billy.preprod/api/v1/requests), ResultStatus(403 Forbidden), Reason(The authenticated user is not authorized to execute the action check_host.), Command(null), Cluster name(null) I should be in the cluster admin group..I'll try again with the local admin user...

davide1 · ‎10-25-2017

@Jay SenSharma No, I'm trying to ADD the hosts to the cluster, they are currently not present. Screenshot: Ambari-metrics is installed but it hasn't run yet because the host is still not in any ambari cluster. ambari-agent.log (DEBUG enabled): INFO 2017-10-25 15:25:42,947 DataCleaner.py:39 - Data cleanup thread started INFO 2017-10-25 15:25:42,949 DataCleaner.py:120 - Data cleanup started INFO 2017-10-25 15:25:42,949 DataCleaner.py:122 - Data cleanup finished INFO 2017-10-25 15:25:42,971 PingPortListener.py:50 - Ping port listener started on port: 8670 INFO 2017-10-25 15:25:42,972 main.py:181 - Newloglevel=logging.DEBUG INFO 2017-10-25 15:25:42,972 main.py:436 - Connecting to Ambari server at https://ambari-data.billy.preprod:8440 (192.168.40.120) DEBUG 2017-10-25 15:25:42,973 NetUtil.py:110 - Trying to connect to https://ambari-data.billy.preprod:8440 INFO 2017-10-25 15:25:42,973 NetUtil.py:67 - Connecting to https://ambari-data.billy.preprod:8440/ca DEBUG 2017-10-25 15:25:43,064 NetUtil.py:87 - GET https://ambari-data.billy.preprod:8440/ca -> 200, body: INFO 2017-10-25 15:25:43,064 main.py:446 - Connected to Ambari server ambari-data.billy.preprod DEBUG 2017-10-25 15:25:43,064 Controller.py:66 - Initializing Controller RPC thread. INFO 2017-10-25 15:25:43,065 threadpool.py:58 - Started thread pool with 3 core threads and 20 maximum threads WARNING 2017-10-25 15:25:43,065 AlertSchedulerHandler.py:280 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs. INFO 2017-10-25 15:25:43,065 AlertSchedulerHandler.py:175 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x1779cd0>; currently running: False DEBUG 2017-10-25 15:25:43,066 scheduler.py:574 - Scheduler started DEBUG 2017-10-25 15:25:43,066 scheduler.py:579 - Looking for jobs to run DEBUG 2017-10-25 15:25:43,066 scheduler.py:599 - No jobs; waiting until a job is added INFO 2017-10-25 15:25:45,085 hostname.py:98 - Read public hostname 'druid-co01.billy.preprod' using socket.getfqdn() INFO 2017-10-25 15:25:45,127 Hardware.py:174 - Some mount points were ignored: /dev/shm, /run, /sys/fs/cgroup, /run/user/0 DEBUG 2017-10-25 15:25:45,296 HostCheckReportFileHandler.py:126 - Host check report at /var/lib/ambari-agent/data/hostcheck.result DEBUG 2017-10-25 15:25:45,297 HostCheckReportFileHandler.py:177 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result DEBUG 2017-10-25 15:25:45,297 HostCheckReportFileHandler.py:182 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result INFO 2017-10-25 15:25:45,299 Controller.py:170 - Registering with druid-co01.billy.preprod (192.168.40.52) (agent='{"hardwareProfile": {"kernel": "Linux", "domain": "billy.preprod", "physicalprocessorcount": 4, "kernelrelease": "3.10.0-693.5.2.el7.x86_64", "uptime_days": "0", "memorytotal": 8002256, "swapfree": "0.00 GB", "memorysize": 8002256, "osfamily": "redhat", "swapsize": "0.00 GB", "processorcount": 4, "netmask": "255.255.255.0", "timezone": "CET", "hardwareisa": "x86_64", "memoryfree": 6255416, "operatingsystem": "centos", "kernelmajversion": "3.10", "kernelversion": "3.10.0", "macaddress": "00:1A:4A:16:01:AD", "operatingsystemrelease": "7.4.1708", "ipaddress": "192.168.40.52", "hostname": "druid-co01", "uptime_hours": "5", "fqdn": "druid-co01.billy.preprod", "id": "root", "architecture": "x86_64", "selinux": false, "mounts": [{"available": "5313004", "used": "3064340", "percent": "37%", "device": "/dev/vda1", "mountpoint": "/", "type": "xfs", "size": "8377344"}, {"available": "3978004", "used": "0", "percent": "0%", "device": "devtmpfs", "mountpoint": "/dev", "type": "devtmpfs", "size": "3978004"}], "hardwaremodel": "x86_64", "uptime_seconds": "18331", "interfaces": "eth0,lo"}, "currentPingPort": 8670, "prefix": "/var/lib/ambari-agent/data", "agentVersion": "2.5.0.3", "agentEnv": {"transparentHugePage": "", "hostHealth": {"agentTimeStampAtReporting": 1508937945297, "activeJavaProcs": [{"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server coordinator", "pid": 11784, "hadoop": true, "user": "root"}, {"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server overlord", "pid": 11886, "hadoop": true, "user": "root"}], "liveServices": [{"status": "Healthy", "name": "ntpd or chronyd", "desc": ""}]}, "reverseLookup": true, "alternatives": [], "umask": "18", "firewallName": "iptables", "stackFoldersAndFiles": [], "existingUsers": [], "firewallRunning": true}, "timestamp": 1508937945235, "hostname": "druid-co01.billy.preprod", "responseId": -1, "publicHostname": "druid-co01.billy.preprod"}') INFO 2017-10-25 15:25:45,300 NetUtil.py:67 - Connecting to https://ambari-data.billy.preprod:8440/connection_info DEBUG 2017-10-25 15:25:45,385 NetUtil.py:87 - GET https://ambari-data.billy.preprod:8440/connection_info -> 200, body: {"security.server.two_way_ssl":"false"} DEBUG 2017-10-25 15:25:45,385 security.py:52 - Server two-way SSL authentication required: False INFO 2017-10-25 15:25:45,385 security.py:93 - SSL Connect being called.. connecting to the server INFO 2017-10-25 15:25:45,516 security.py:60 - SSL connection established. Two-way SSL authentication is turned off on the server. DEBUG 2017-10-25 15:25:45,606 Controller.py:177 - Registration response is {u'agentConfig': {u'agent.auto.cache.update': u'true', u'agent.check.mounts.timeout': u'0', u'agent.check.remote.mounts': u'false'}, u'exitstatus': 0, u'response': u'OK', u'responseId': 0, u'statusCommands': []} INFO 2017-10-25 15:25:45,606 Controller.py:196 - Registration Successful (response id = 0) INFO 2017-10-25 15:25:45,606 AmbariConfig.py:316 - Updating config property (agent.check.remote.mounts) with value (false) INFO 2017-10-25 15:25:45,607 AmbariConfig.py:316 - Updating config property (agent.auto.cache.update) with value (true) INFO 2017-10-25 15:25:45,607 AmbariConfig.py:316 - Updating config property (agent.check.mounts.timeout) with value (0) DEBUG 2017-10-25 15:25:45,607 Controller.py:205 - Updated config:<AmbariConfig.AmbariConfig instance at 0x174bef0> DEBUG 2017-10-25 15:25:45,607 Controller.py:212 - Got status commands on registration. DEBUG 2017-10-25 15:25:45,607 Controller.py:256 - No status commands received from ambari-data.billy.preprod WARNING 2017-10-25 15:25:45,607 AlertSchedulerHandler.py:123 - There are no alert definition commands in the heartbeat; unable to update definitions INFO 2017-10-25 15:25:45,607 Controller.py:512 - Registration response from ambari-data.billy.preprod was OK INFO 2017-10-25 15:25:45,607 Controller.py:517 - Resetting ActionQueue... INFO 2017-10-25 15:25:55,619 Controller.py:304 - Heartbeat (response id = 0) with server is running... INFO 2017-10-25 15:25:55,619 Controller.py:311 - Building heartbeat message DEBUG 2017-10-25 15:25:55,620 Heartbeat.py:83 - Building Heartbeat: {responseId = 0, timestamp = 1508937955620, commandsInProgress = False, componentsMapped = False,recoveryTimestamp = -1} DEBUG 2017-10-25 15:25:55,620 Heartbeat.py:86 - Heartbeat: {'componentStatus': [], 'hostname': 'druid-co01.billy.preprod', 'nodeStatus': {'cause': 'NONE', 'status': 'HEALTHY'}, 'recoveryReport': {'summary': 'DISABLED'}, 'recoveryTimestamp': -1, 'reports': [], 'responseId': 0, 'timestamp': 1508937955620} INFO 2017-10-25 15:25:55,621 Heartbeat.py:90 - Adding host info/state to heartbeat message. DEBUG 2017-10-25 15:25:55,733 HostCheckReportFileHandler.py:126 - Host check report at /var/lib/ambari-agent/data/hostcheck.result DEBUG 2017-10-25 15:25:55,734 HostCheckReportFileHandler.py:177 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result DEBUG 2017-10-25 15:25:55,734 HostCheckReportFileHandler.py:182 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result INFO 2017-10-25 15:25:55,770 Hardware.py:174 - Some mount points were ignored: /, /dev, /dev/shm, /run, /sys/fs/cgroup, /run/user/0 DEBUG 2017-10-25 15:25:55,771 Heartbeat.py:100 - agentEnv: {'transparentHugePage': '', 'hostHealth': {'agentTimeStampAtReporting': 1508937955734, 'activeJavaProcs': [{'command': u'/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server overlord', 'pid': 12022, 'hadoop': True, 'user': 'root'}, {'command': u'/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server coordinator', 'pid': 12023, 'hadoop': True, 'user': 'root'}], 'liveServices': [{'status': 'Healthy', 'name': 'ntpd or chronyd', 'desc': ''}]}, 'reverseLookup': True, 'alternatives': [], 'umask': '18', 'firewallName': 'iptables', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': True} DEBUG 2017-10-25 15:25:55,771 Heartbeat.py:101 - mounts: [] INFO 2017-10-25 15:25:55,771 Controller.py:318 - Sending Heartbeat (id = 0): {"alerts": [], "nodeStatus": {"status": "HEALTHY", "cause": "NONE"}, "timestamp": 1508937955620, "hostname": "druid-co01.billy.preprod", "responseId": 0, "reports": [], "mounts": [], "recoveryTimestamp": -1, "agentEnv": {"transparentHugePage": "", "hostHealth": {"agentTimeStampAtReporting": 1508937955734, "activeJavaProcs": [{"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server overlord", "pid": 12022, "hadoop": true, "user": "root"}, {"command": "/usr/bin/java -server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -classpath .:/etc/druid/:/etc/hadoop/conf/:/opt/druid/lib/* io.druid.cli.Main server coordinator", "pid": 12023, "hadoop": true, "user": "root"}], "liveServices": [{"status": "Healthy", "name": "ntpd or chronyd", "desc": ""}]}, "reverseLookup": true, "alternatives": [], "umask": "18", "firewallName": "iptables", "stackFoldersAndFiles": [], "existingUsers": [], "firewallRunning": true}, "recoveryReport": {"summary": "DISABLED"}, "componentStatus": []} INFO 2017-10-25 15:25:55,775 Controller.py:332 - Heartbeat response received (id = 1) INFO 2017-10-25 15:25:55,775 Controller.py:341 - Heartbeat interval is 10 seconds INFO 2017-10-25 15:25:55,775 Controller.py:377 - Updating configurations from heartbeat INFO 2017-10-25 15:25:55,775 Controller.py:386 - Adding cancel/execution commands DEBUG 2017-10-25 15:25:55,775 Controller.py:246 - No commands received from ambari-data.billy.preprod DEBUG 2017-10-25 15:25:55,775 Controller.py:256 - No status commands received from ambari-data.billy.preprod INFO 2017-10-25 15:25:55,775 Controller.py:403 - Adding recovery commands DEBUG 2017-10-25 15:25:55,775 Controller.py:422 - No commands sent from ambari-data.billy.preprod INFO 2017-10-25 15:25:55,775 Controller.py:471 - Waiting 9.9 for next heartbeat INFO 2017-10-25 15:26:05,677 Controller.py:478 - Wait for next heartbeat over DEBUG 2017-10-25 15:26:05,677 Controller.py:304 - Heartbeat (response id = 1) with server is running... DEBUG 2017-10-25 15:26:05,677 Controller.py:311 - Building heartbeat message DEBUG 2017-10-25 15:26:05,678 Heartbeat.py:83 - Building Heartbeat: {responseId = 1, timestamp = 1508937965678, commandsInProgress = False, componentsMapped = False,recoveryTimestamp = -1} DEBUG 2017-10-25 15:26:05,679 Heartbeat.py:86 - Heartbeat: {'componentStatus': [], 'hostname': 'druid-co01.billy.preprod', 'nodeStatus': {'cause': 'NONE', 'status': 'HEALTHY'}, 'recoveryReport': {'summary': 'DISABLED'}, 'recoveryTimestamp': -1, 'reports': [], 'responseId': 1, 'timestamp': 1508937965678} DEBUG 2017-10-25 15:26:05,680 Controller.py:318 - Sending Heartbeat (id = 1): {"alerts": [], "nodeStatus": {"status": "HEALTHY", "cause": "NONE"}, "timestamp": 1508937965678, "hostname": "druid-co01.billy.preprod", "responseId": 1, "reports": [], "recoveryTimestamp": -1, "recoveryReport": {"summary": "DISABLED"}, "componentStatus": []} DEBUG 2017-10-25 15:26:05,683 Controller.py:332 - Heartbeat response received (id = 2) DEBUG 2017-10-25 15:26:05,683 Controller.py:341 - Heartbeat interval is 10 seconds DEBUG 2017-10-25 15:26:05,683 Controller.py:377 - Updating configurations from heartbeat DEBUG 2017-10-25 15:26:05,684 Controller.py:386 - Adding cancel/execution commands DEBUG 2017-10-25 15:26:05,684 Controller.py:246 - No commands received from ambari-data.billy.preprod DEBUG 2017-10-25 15:26:05,684 Controller.py:256 - No status commands received from ambari-data.billy.preprod DEBUG 2017-10-25 15:26:05,684 Controller.py:403 - Adding recovery commands DEBUG 2017-10-25 15:26:05,684 Controller.py:422 - No commands sent from ambari-data.billy.preprod DEBUG 2017-10-25 15:26:05,684 Controller.py:471 - Waiting 9.9 for next heartbeat And the last part of the server log: 25 Oct 2017 15:25:43,107 WARN [qtp-ambari-agent-39] SecurityFilter:103 - Request https://ambari-data.billy.preprod:8440/ca doesn't match any pattern. 25 Oct 2017 15:25:43,107 WARN [qtp-ambari-agent-39] SecurityFilter:62 - This request is not allowed on this port: https://ambari-data.billy.preprod:8440/ca 25 Oct 2017 15:25:45,570 INFO [qtp-ambari-agent-36] HeartBeatHandler:425 - agentOsType = centos7 25 Oct 2017 15:25:45,647 INFO [qtp-ambari-agent-36] HostImpl:329 - Received host registration, host=[hostname=druid-co01,fqdn=druid-co01.billy.preprod,domain=billy.preprod,architecture=x86_64,processorcount=4,physicalprocessorcount=4,osname=centos,osversion=7.4.1708,osfamily=redhat,memory=8002256,uptime_hours=5,mounts=(available=5313004,mountpoint=/,used=3064340,percent=37%,size=8377344,device=/dev/vda1,type=xfs)(available=3978004,mountpoint=/dev,used=0,percent=0%,size=3978004,device=devtmpfs,type=devtmpfs)] , registrationTime=1508937945570, agentVersion=2.5.0.3 25 Oct 2017 15:25:45,647 INFO [qtp-ambari-agent-36] TopologyManager:548 - TopologyManager.onHostRegistered: Entering 25 Oct 2017 15:25:45,647 INFO [qtp-ambari-agent-36] TopologyManager:602 - Host druid-co01.billy.preprod re-registered, will not be added to the available hosts list

davide1 · ‎10-25-2017

Hello, using Ambari Server 2.5.0.3-7 and Agent 2.5.0.3 and I get this error when manually adding a new host to an Ambari Cluster. The agent was preinstalled and preconfigured and the new host is running CentOS 7.4 No useful info (warning, error) in either agent or server logs.

davide1 · ‎07-11-2017

Hello @Michael Dennis "MD" Uanang can you confirm that this worked on a 3-nodes Zookeeper install? I mean, I need to move my zk cluster from the original 3 hosts to other 3 hosts, will this work repeating 3 times the same procedure?

davide1 · ‎03-08-2017

Thank you very much @Roland Simonis I had the same problem (reinstalled a slave from scratch, and decommisioned via Ambari both DN and NM) but Ambari doesn't have a GUI option for recommissioning the NM, so the ResourceManager was always denying access. With the API call you posted I easily recommissioned the NM and now everything is working again as expected. Thanks!

davide1 · ‎03-01-2017

Hello how can I install HDP 2.5.3 with CloudBreak latest release 1.6.3? I'm using a blueprint for Ambari with HDP: 2.5 but it installs 2.5.0 and not the latest HDP revision

davide1 · ‎02-16-2017

This is a Hive Streaming installation updated from HDP 2.3.6 to HDP 2.5.3 directly. This table partition was created on 2.5.3

davide1 · ‎02-16-2017

No, @Eugene Koifman Just the "_orc_acid_version" and all the delta subdirs with the 8 bucket files + 8 _flush_length. _tmp file never gets createdand actually the compaction job fails in a matter of seconds

davide1 · ‎02-16-2017

@Eugene Koifman @Wei Zheng this seems related to https://issues.apache.org/jira/browse/HIVE-15142 do you have any idea about our problem?

Online	Offline
Last Visited	‎10-25-2017 01:39 PM

Member Since	‎05-31-2016 03:46 PM
Last Visited	‎10-25-2017 01:39 PM
Posts	23
Kudos received	4

Cloudera Community

Re: Ambari error: All bootstrapped hosts registere...

Re: Ambari error: All bootstrapped hosts registere...

Re: Ambari error: All bootstrapped hosts registere...

Re: Ambari error: All bootstrapped hosts registere...

Ambari error: All bootstrapped hosts registered bu...

Re: What are the steps for moving a zookeeper serv...

Re: How can i recommission Node managers and Data ...

Install HDP 2.5.3 with CloudBreak 1.6.3

Re: hive transactional table compaction fails

Re: hive transactional table compaction fails

Re: hive transactional table compaction fails