Created 08-19-2017 05:53 AM
I am setting up a 3-node HDP cluster using Ambari. I have installed the Ambari server and am continuing the step-by-step installation through the web UI. I had completed the setup up to step 9 (Install, Start and Test), but it would not move past 32% (see screenshot 2017-08-19.png). So I restarted the Ambari server, and now the UI has started from the first step again.
Now I have reached step 3, Confirm Hosts. The hosts are registered and the status is Success, but the UI is not moving ahead and the message being shown is 'Please wait while the hosts are being checked for potential problems...' (see screenshot 2017-08-19-2.png). I waited for a long time, but it still doesn't progress any further.
I have also tried restarting the Ambari agents and the Ambari server, but with the same result. The steps I am performing now already passed successfully once, and I haven't changed anything major on the nodes.
Please suggest a solution.
Created 08-19-2017 08:49 PM
Ambari server log
2017-08-18 11:17:39,720 [CRITICAL] [HIVE] [hive_server_process] (HiveServer2 Process) Connection failed on host hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net:10000 (Traceback (most recent call last):
Ensure the Ambari agent is running and the port is free
(Ambari Agent Heartbeat) hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net
Copy the Ambari and HDP*.repo files to /etc/yum.repos.d/ on all other hosts.
Confirm the repos are accessible with:
# yum repolist
You should see something like this:
HDP-2.3.2.0 | 2.9 kB 00:00
HDP-UTILS-1.1.0.20 | 2.9 kB 00:00
Updates-ambari-2.1.2.1 | 2.9 kB 00:00
Check that the ambari-agents on these nodes are running; if not, restart them. Ensure the hostname value in /etc/ambari-agent/conf/ambari-agent.ini points to your Ambari server:
[server]
hostname={your-ambari-server}
url_port=8440
secured_url_port=8441

hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net is not sending heartbeats
hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net is not sending heartbeats
hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net is not sending heartbeats
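To check the agent setting on each node without opening the file by hand, something like the following could work. It is only a sketch: the function name `agent_server_hostname` is made up for illustration, and it assumes the file uses plain `key=value` lines under a `[server]` header, as in the snippet above.

```shell
# Print the value of "hostname" from the [server] section of an
# ambari-agent.ini-style file. The function name is hypothetical.
agent_server_hostname() {
  awk -F= '
    /^\[/ { in_server = ($0 ~ /^\[server\]/) }   # track the current section
    in_server && $1 ~ /^[ \t]*hostname[ \t]*$/ {
      gsub(/^[ \t]+|[ \t]+$/, "", $2)            # trim spaces around the value
      print $2
      exit
    }
  ' "$1"
}

# Example (path taken from this thread):
# agent_server_hostname /etc/ambari-agent/conf/ambari-agent.ini
```

Running it on every node and comparing the output with the Ambari server's `hostname -f` should show quickly whether an agent is pointing at the wrong server.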
Error
Caused by: org.apache.ambari.server.HostNotFoundException: Host not found, hostname=
Double check your DNS.
$ hostname -f
The output should be the FQDN. I see a lot of 'connection refused' errors in the log; can you ensure the Ambari server can reach the other hosts in the cluster?
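The two checks above (an FQDN from `hostname -f`, and connectivity between the agents and the Ambari server) can be scripted roughly like this. Assumptions on my side: bash and coreutils `timeout` are available, ports 8440 and 8441 are the agent ports from the ambari-agent.ini snippet, and `your-ambari-server` is a placeholder. The function names are made up.

```shell
# Return 0 if the given name looks like a fully qualified domain name,
# i.e. it contains a dot and is not a localhost alias.
is_fqdn() {
  case "$1" in
    localhost|localhost.*) return 1 ;;
    *.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp redirection, so bash is invoked explicitly.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example usage on an agent node:
# is_fqdn "$(hostname -f)" || echo "hostname -f did not return an FQDN"
# for p in 8440 8441; do
#   port_open your-ambari-server "$p" || echo "cannot reach port $p"
# done
```

`is_fqdn` is only a shape check; it does not prove that the name resolves to the right address.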
Created 08-20-2017 05:45 AM
Thanks for bearing with me.
I checked all the configurations as per your suggestion.
1) Ambari-server and agents (all hosts) are running.
2) The repolist has the required entries:
repo id
!HDP-2.4
!HDP-UTILS-1.1.0.20
!Updates-ambari-2.2.2.0
3) /etc/ambari-agent/conf/ambari-agent.ini shows the server hostname (on all 3 nodes):
[server]
hostname=hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net
url_port=8440
secured_url_port=8441

[agent]
prefix=/var/lib/ambari-agent/data
.....
and the hostnames are also correct:
[anup@hdp25-node3 ~]$ cat /etc/hosts
127.0.0.1 hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.4 hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net
10.0.0.5 hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net
10.0.0.6 hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net
I checked the SELinux configuration with `getenforce` and it was permissive, so I changed it to `disabled` and rebooted the nodes so that the change could take effect.
But I think this restart has messed up some configuration on my server. All the Ambari-related settings are as given above, but now I can't access the UI like before. I get a 'The site can't be found' error.
My setup is based on Microsoft Azure. I have tried using the URL in different ways (public IP, hostname).
I have been checking several options, such as this one (seen in another post):
curl -u admin:admin -i -H 'X-Requested-By: ambari' -X POST -d '{"wizard-data":"{\"userName\":\"admin\",\"controllerName\":\"\"}"}' http://40.121.204.95:8080/api/v1/persist
This one times out.
I don't have any iptables set up. Ambari shows up in ps, and 8080 is listening:
[root@hdp25-node1 anup]# service iptables stop
Redirecting to /bin/systemctl stop iptables.service
Failed to stop iptables.service: Unit iptables.service not loaded.
[root@hdp25-node1 anup]# ps -aux | grep ambari-server
root 1965 0.0 0.0 11636 624 pts/0 S 05:32 0:00 /bin/sh -c ulimit -n 10000 ; /opt/jdk1.8.0_141/bin/java -server -XX:NewRatio=3 -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit -XX:CMSInitiatingOccupancyFraction=60 -Dsun.zip.disableMemoryMapping=true -Xms512m -Xmx2048m -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -Xms512m -Xmx2048m -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -cp '/etc/ambari-server/conf:/usr/lib/ambari-server/*:/usr/share/java/postgresql-jdbc.jar' org.apache.ambari.server.controller.AmbariServer > /var/log/ambari-server/ambari-server.out 2>&1 || echo $? > /var/run/ambari-server/ambari-server.exitcode &
root 1966 20.0 3.0 5683744 500464 pts/0 Sl 05:32 1:07 /opt/jdk1.8.0_141/bin/java -server -XX:NewRatio=3 -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit -XX:CMSInitiatingOccupancyFraction=60 -Dsun.zip.disableMemoryMapping=true -Xms512m -Xmx2048m -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -Xms512m -Xmx2048m -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -cp /etc/ambari-server/conf:/usr/lib/ambari-server/*:/usr/share/java/postgresql-jdbc.jar org.apache.ambari.server.controller.AmbariServer
root 2570 0.0 0.0 112660 976 pts/0 S+ 05:38 0:00 grep --color=auto ambari-server
[root@hdp25-node1 anup]# netstat -anop | grep 8080
tcp6 0 0 :::8080 :::* LISTEN 1966/java off (0.00/0/0)
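For anyone repeating this check, the LISTEN test can be scripted so it gives a yes/no answer instead of a line to eyeball. This is a rough sketch with a made-up function name (`is_listening`), and it assumes the `netstat -anop` column layout shown above, where the local address is column 4 and the state is column 6. Note that a LISTEN entry only shows the server has bound the port locally; it says nothing about whether a remote client can get through to it.

```shell
# Read netstat -an style output on stdin and exit 0 if some socket is
# in LISTEN state on the given local port, 1 otherwise.
is_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}

# Example on the Ambari server host:
# netstat -anop | is_listening 8080 && echo "8080 is bound locally"
```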
Please let me know which bit I am missing.
Latest logs:
tail -fn 100 /var/log/ambari-server/ambari-server.log
[root@hdp25-node1 anup]# tail -fn 100 /var/log/ambari-server/ambari-server.log 20 Aug 2017 05:33:06,674 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component HST_SERVER on hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,686 INFO [main] AmbariServer:534 - ********* Initializing ActionManager ********** 20 Aug 2017 05:33:06,687 INFO [main] AmbariServer:537 - ********* Initializing Controller ********** 20 Aug 2017 05:33:06,687 INFO [main] AmbariServer:541 - ********* Initializing Scheduled Request Manager ********** 20 Aug 2017 05:33:06,689 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component SPARK_JOBHISTORYSERVER on hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,699 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component SUPERVISOR on hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,701 INFO [main] Server:272 - jetty-8.1.17.v20150415 20 Aug 2017 05:33:06,709 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component NODEMANAGER on hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,720 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component ZOOKEEPER_SERVER on hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,990 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:157 - Heartbeat lost from host hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,992 INFO [ambari-hearbeat-monitor] TopologyManager:387 - Hearbeat for host hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net lost thus removing it from available hosts. 
20 Aug 2017 05:33:06,993 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component METRICS_MONITOR on hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,994 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component METRICS_COLLECTOR on hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,995 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component FLUME_HANDLER on hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,997 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component DATANODE on hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:06,999 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component HST_AGENT on hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,000 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component SUPERVISOR on hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,002 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component NODEMANAGER on hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,003 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component ZOOKEEPER_SERVER on hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,096 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:157 - Heartbeat lost from host hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,101 INFO [ambari-hearbeat-monitor] TopologyManager:387 - Hearbeat for host hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net lost 
thus removing it from available hosts. 20 Aug 2017 05:33:07,101 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component METRICS_MONITOR on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,102 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component FLUME_HANDLER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,103 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component SECONDARY_NAMENODE on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,104 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component DATANODE on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,105 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component HIVE_SERVER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,105 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component MYSQL_SERVER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,106 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component HIVE_METASTORE on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,107 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component WEBHCAT_SERVER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,108 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component HISTORYSERVER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,109 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - 
Setting component state to UNKNOWN for component OOZIE_SERVER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,110 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component HST_AGENT on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,111 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component SUPERVISOR on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,111 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component NIMBUS on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,112 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component DRPC_SERVER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,113 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component STORM_UI_SERVER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,113 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component NODEMANAGER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,114 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component APP_TIMELINE_SERVER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,115 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component RESOURCEMANAGER on hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:07,116 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:172 - Setting component state to UNKNOWN for component ZOOKEEPER_SERVER on 
hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net 20 Aug 2017 05:33:16,570 INFO [main] AbstractConnector:338 - Started SelectChannelConnector@0.0.0.0:8080 20 Aug 2017 05:33:16,570 INFO [main] Server:272 - jetty-8.1.17.v20150415 20 Aug 2017 05:33:18,680 INFO [main] SslContextFactory:300 - Enabled Protocols [SSLv2Hello, TLSv1, TLSv1.1, TLSv1.2] of [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2] 20 Aug 2017 05:33:18,681 INFO [main] AbstractConnector:338 - Started SslSelectChannelConnector@0.0.0.0:8440 20 Aug 2017 05:33:18,694 INFO [main] SslContextFactory:300 - Enabled Protocols [SSLv2Hello, TLSv1, TLSv1.1, TLSv1.2] of [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2] 20 Aug 2017 05:33:18,695 INFO [main] AbstractConnector:338 - Started SslSelectChannelConnector@0.0.0.0:8441 20 Aug 2017 05:33:18,695 INFO [main] AmbariServer:558 - ********* Started Server ********** 20 Aug 2017 05:33:18,695 INFO [main] ActionManager:77 - Starting scheduler thread 20 Aug 2017 05:33:18,696 INFO [main] ServerActionExecutor:146 - Starting Server Action Executor thread... 20 Aug 2017 05:33:18,698 INFO [main] ServerActionExecutor:173 - Server Action Executor thread started. 20 Aug 2017 05:33:18,698 INFO [main] AmbariServer:561 - ********* Started ActionManager ********** 20 Aug 2017 05:33:18,698 INFO [main] ExecutionScheduleManager:201 - Starting scheduler 20 Aug 2017 05:33:18,762 INFO [MLog-Init-Reporter] MLog:212 - MLog clients using slf4j logging. 20 Aug 2017 05:33:18,845 INFO [main] C3P0Registry:212 - Initializing c3p0-0.9.5.2 [built 08-December-2015 22:06:04 -0800; debug? true; trace: 10] 20 Aug 2017 05:33:18,887 INFO [main] StdSchedulerFactory:1184 - Using default implementation for ThreadExecutor 20 Aug 2017 05:33:18,906 INFO [main] SchedulerSignalerImpl:61 - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl 20 Aug 2017 05:33:18,906 INFO [main] QuartzScheduler:240 - Quartz Scheduler v.2.2.1 created. 
20 Aug 2017 05:33:18,907 INFO [main] JobStoreTX:670 - Using thread monitor-based data access locking (synchronization). 20 Aug 2017 05:33:18,908 INFO [main] JobStoreTX:59 - JobStoreTX initialized. 20 Aug 2017 05:33:18,909 INFO [main] QuartzScheduler:305 - Scheduler meta-data: Quartz Scheduler (v2.2.1) 'ExecutionScheduler' with instanceId 'NON_CLUSTERED' Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally. NOT STARTED. Currently in standby mode. Number of jobs executed: 0 Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 5 threads. Using job-store 'org.quartz.impl.jdbcjobstore.JobStoreTX' - which supports persistence. and is not clustered. 20 Aug 2017 05:33:18,909 INFO [main] StdSchedulerFactory:1339 - Quartz scheduler 'ExecutionScheduler' initialized from an externally provided properties instance. 20 Aug 2017 05:33:18,909 INFO [main] StdSchedulerFactory:1343 - Quartz scheduler version: 2.2.1 20 Aug 2017 05:33:18,909 INFO [main] QuartzScheduler:2311 - JobFactory set to: org.apache.ambari.server.state.scheduler.GuiceJobFactory@14a97e7a 20 Aug 2017 05:33:18,910 INFO [main] AmbariServer:564 - ********* Started Scheduled Request Manager ********** 20 Aug 2017 05:33:18,914 INFO [main] AmbariServer:567 - ********* Started Services ********** 20 Aug 2017 05:33:18,924 INFO [AmbariServerAlertService STARTING] AmbariServerAlertService:257 - Scheduled server alert ambari_server_agent_heartbeat to run every 2 minutes 20 Aug 2017 05:33:18,925 INFO [AmbariServerAlertService STARTING] AmbariServerAlertService:257 - Scheduled server alert ambari_server_stale_alerts to run every 5 minutes 20 Aug 2017 05:33:43,587 ERROR [qtp-ambari-agent-54] HeartBeatHandler:198 - CurrentResponseId unknown for hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net - send register command 20 Aug 2017 05:33:43,668 INFO [qtp-ambari-agent-55] HeartBeatHandler:402 - agentOsType = redhat7 20 Aug 2017 05:33:43,739 INFO [qtp-ambari-agent-55] HostImpl:285 - Received host 
registration, host=[hostname=hdp25-node1,fqdn=hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net,domain=wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net,architecture=x86_64,processorcount=4,physicalprocessorcount=4,osname=redhat,osversion=7.3,osfamily=redhat,memory=16416608,uptime_hours=0,mounts=(available=25310412,mountpoint=/,used=7714864,percent=24%,size=33025276,device=/dev/sda2,type=xfs)(available=8198368,mountpoint=/dev,used=0,percent=0%,size=8198368,device=devtmpfs,type=devtmpfs)(available=8208304,mountpoint=/dev/shm,used=0,percent=0%,size=8208304,device=tmpfs,type=tmpfs)(available=8199824,mountpoint=/run,used=8480,percent=1%,size=8208304,device=tmpfs,type=tmpfs)(available=403144,mountpoint=/boot,used=105436,percent=21%,size=508580,device=/dev/sda1,type=xfs)(available=29055312,mountpoint=/mnt/resource,used=2146336,percent=7%,size=32895696,device=/dev/sdb1,type=ext4)] , registrationTime=1503207223668, agentVersion=2.2.2.0 20 Aug 2017 05:33:43,740 INFO [qtp-ambari-agent-55] TopologyManager:311 - TopologyManager.onHostRegistered: Entering 20 Aug 2017 05:33:43,740 INFO [qtp-ambari-agent-55] TopologyManager:313 - TopologyManager.onHostRegistered: host = hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net is already associated with the cluster or is currently being processed 20 Aug 2017 05:33:43,859 INFO [qtp-ambari-agent-55] HeartBeatHandler:469 - Recovery configuration set to RecoveryConfig{, type=AUTO_START, maxCount=6, windowInMinutes=60, retryGap=5, maxLifetimeCount=1024, disabledComponents=, enabledComponents=METRICS_COLLECTOR} 20 Aug 2017 05:33:55,422 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component METRICS_GRAFANA of service AMBARI_METRICS of cluster hdp24 has changed from UNKNOWN to INSTALLED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,435 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State 
of service component FLUME_HANDLER of service FLUME of cluster hdp24 has changed from UNKNOWN to INSTALLED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,451 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component METRICS_MONITOR of service AMBARI_METRICS of cluster hdp24 has changed from UNKNOWN to INSTALLED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,462 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component SPARK_JOBHISTORYSERVER of service SPARK of cluster hdp24 has changed from UNKNOWN to INSTALLED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,477 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component HST_AGENT of service SMARTSENSE of cluster hdp24 has changed from UNKNOWN to STARTED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,492 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component SUPERVISOR of service STORM of cluster hdp24 has changed from UNKNOWN to INSTALLED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,501 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component NODEMANAGER of service YARN of cluster hdp24 has changed from UNKNOWN to INSTALLED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,508 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component KNOX_GATEWAY of service KNOX of cluster hdp24 has changed from UNKNOWN to INSTALLED at host 
hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,515 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component ZOOKEEPER_SERVER of service ZOOKEEPER of cluster hdp24 has changed from UNKNOWN to INSTALLED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,521 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component NAMENODE of service HDFS of cluster hdp24 has changed from UNKNOWN to INSTALLED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,527 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component DATANODE of service HDFS of cluster hdp24 has changed from UNKNOWN to INSTALLED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:33:55,536 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component HST_SERVER of service SMARTSENSE of cluster hdp24 has changed from UNKNOWN to STARTED at host hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net according to STATUS_COMMAND report 20 Aug 2017 05:35:18,967 INFO [Thread-21] AbstractPoolBackedDataSource:212 - Initializing c3p0 pool... 
com.mchange.v2.c3p0.ComboPooledDataSource [ acquireIncrement -> 3, acquireRetryAttempts -> 30, acquireRetryDelay -> 1000, autoCommitOnClose -> false, automaticTestTable -> null, breakAfterAcquireFailure -> false, checkoutTimeout -> 0, connectionCustomizerClassName -> null, connectionTesterClassName -> com.mchange.v2.c3p0.impl.DefaultConnectionTester, contextClassLoaderSource -> caller, dataSourceName -> z8kflt9p1yig0g5jqyczv|7767c18d, debugUnreturnedConnectionStackTraces -> false, description -> null, driverClass -> org.postgresql.Driver, extensions -> {}, factoryClassLocation -> null, forceIgnoreUnresolvedTransactions -> false, forceSynchronousCheckins -> false, forceUseNamedDriverClass -> false, identityToken -> z8kflt9p1yig0g5jqyczv|7767c18d, idleConnectionTestPeriod -> 50, initialPoolSize -> 3, jdbcUrl -> jdbc:postgresql://localhost/ambari, maxAdministrativeTaskTime -> 0, maxConnectionAge -> 0, maxIdleTime -> 0, maxIdleTimeExcessConnections -> 0, maxPoolSize -> 5, maxStatements -> 0, maxStatementsPerConnection -> 120, minPoolSize -> 1, numHelperThreads -> 3, preferredTestQuery -> select 0, privilegeSpawnedThreads -> false, properties -> {user=******, password=******}, propertyCycle -> 0, statementCacheNumDeferredCloseThreads -> 0, testConnectionOnCheckin -> true, testConnectionOnCheckout -> false, unreturnedConnectionTimeout -> 0, userOverrides -> {}, usesTraditionalReflectiveProxies -> false ] 20 Aug 2017 05:35:19,044 INFO [Thread-21] JobStoreTX:861 - Freed 0 triggers from 'acquired' / 'blocked' state. 20 Aug 2017 05:35:19,061 INFO [Thread-21] JobStoreTX:871 - Recovering 0 jobs that were in-progress at the time of the last shut-down. 20 Aug 2017 05:35:19,061 INFO [Thread-21] JobStoreTX:884 - Recovery complete. 20 Aug 2017 05:35:19,062 INFO [Thread-21] JobStoreTX:891 - Removed 0 'complete' triggers. 20 Aug 2017 05:35:19,062 INFO [Thread-21] JobStoreTX:896 - Removed 0 stale fired job entries. 
20 Aug 2017 05:35:19,068 INFO [Thread-21] QuartzScheduler:575 - Scheduler ExecutionScheduler_$_NON_CLUSTERED started.
Created 08-20-2017 07:19 AM
Your /etc/hosts entry looks wrong; it should look like the one below. You are advised never to change the first 2 lines, for IPv4 and IPv6:
127.0.0.1 localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.4 hdp25-node1.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net
10.0.0.5 hdp25-node2.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net
10.0.0.6 hdp25-node3.wulme4ci31tu3lwdofvykqwgkh.bx.internal.cloudapp.net
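A quick way to spot this kind of entry on every host is to list any name on the loopback lines that is not a localhost alias. The helper below is only an illustration (the name `loopback_extra_names` is made up), and it assumes that every legitimate loopback alias starts with `localhost`, which holds for the file above but may not hold everywhere.

```shell
# Print any name on a 127.0.0.1 or ::1 line that does not start with
# "localhost". Empty output means the loopback lines look clean.
loopback_extra_names() {
  awk '
    $1 == "127.0.0.1" || $1 == "::1" {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break               # stop at a trailing comment
        if ($i !~ /^localhost/) print $i
      }
    }
  ' "$1"
}

# Example:
# loopback_extra_names /etc/hosts
```

If the node's own FQDN is printed, the name resolves to 127.0.0.1 on that machine instead of its 10.0.0.x address.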
Can you change that on all the hosts and retry? That could be the cause of the lost connection!
Why does your repolist output have exclamation marks?
!HDP-2.4
!HDP-UTILS-1.1.0.20
!Updates-ambari-2.2.2.0
Can you copy and paste the contents of the files below?
cat /etc/yum.repos.d/ambari.repo
cat /etc/yum.repos.d/hdp.repo
Created 08-20-2017 12:20 PM
I made the change in /etc/hosts as per your suggestion. Also, the firewall service had come back up after the VM restart, so I had to turn it off. Now the Ambari UI is connecting.
I have started the services and tested a few HDFS commands and also Hive commands, but I cannot see the YARN ResourceManager UI even though the YARN service is up and running. Please suggest a solution for that.
About the ! in the repo names: the RHEL site says it is due to invalid metadata.
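As far as I know, the yum man page describes a leading `!` in `yum repolist` output as marking a repo whose cached metadata has expired, so refreshing the cache is worth a try. The small filter below is a sketch (the function name `flagged_repos` is made up) that prints only the flagged repo ids, which makes it easy to compare before and after a refresh.

```shell
# Read `yum repolist` output on stdin and print the ids of repos whose
# first column starts with "!".
flagged_repos() {
  awk '$1 ~ /^!/ { sub(/^!/, "", $1); print $1 }'
}

# Example:
# yum repolist | flagged_repos
# yum clean metadata && yum repolist | flagged_repos   # after a refresh
```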
Contents of /etc/yum.repos.d/ambari.repo:
[anup@hdp25-node1 ~]$ cat /etc/yum.repos.d/ambari.repo
#VERSION_NUMBER=2.2.2.0-460
[Updates-ambari-2.2.2.0]
name=ambari-2.2.2.0 - Updates
baseurl=http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.2.2.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
Contents of /etc/yum.repos.d/HDP.repo:
[anup@hdp25-node1 ~]$ cat /etc/yum.repos.d/hdp.repo
cat: /etc/yum.repos.d/hdp.repo: No such file or directory
[anup@hdp25-node1 ~]$ cat /etc/yum.repos.d/HDP.repo
[HDP-2.4]
name=HDP-2.4
baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.4.3.0
path=/
enabled=1
gpgcheck=0
Created 08-20-2017 12:30 PM
Good progress. Can you paste the screenshot here? Now that the first problem is solved, I would advise you to accept my answer and open a new thread for the YARN UI issue; otherwise this thread will be too long to follow.
Thanks
Created 08-20-2017 12:48 PM
Yes, closing this one is a good idea. I have accepted an answer; I hope that means the thread is closed.
Thanks
Created 08-20-2017 01:01 PM
Yes, open a new one for the RM/YARN UI. People usually ignore a long thread that has been around for ages.