Member since: 03-20-2019
Posts: 4
Kudos Received: 0
Solutions: 0
12-27-2017
01:45 AM
Additional information:
Version: ambari-server 2.5.2, HDP 2.4
Component location: the Ambari server is installed on dwhdpm01p.xxx.com.
12-27-2017
01:41 AM
Description: We found the following error messages in the Ambari alerts. It looks like Ambari cannot connect to the NameNode server (dwhdpm01p.xxx.com) on ports 8019 and 50070. I have no idea why. Please give me some advice.
Check list:
1. netstat
[root@dwhdpm01p ~]# netstat -anp |grep 50070
tcp 0 0 192.168.1.160:50070 0.0.0.0:* LISTEN 10670/java
tcp 0 0 192.168.1.160:50070 192.168.1.161:54986 TIME_WAIT -
tcp6 0 0 172.16.7.40:8080 172.20.249.15:50070 ESTABLISHED 22260/java
[root@dwhdpm01p ~]# netstat -anp |grep 8019
tcp 0 0 192.168.1.160:8019 0.0.0.0:* LISTEN 10246/java
2. telnet
[root@dwhdpm01p ~]# telnet 192.168.1.160 50070
Trying 192.168.1.160...
Connected to 192.168.1.160.
Escape character is '^]'.
3. SELinux is disabled
[root@dwhdpm01p ~]# getenforce
Disabled
4. iptables is stopped
[root@dwhdpm01p ~]# systemctl status iptables
● iptables.service - IPv4 firewall with iptables
Loaded: loaded (/usr/lib/systemd/system/iptables.service; disabled; vendor preset: disabled)
Active: inactive (dead)
5. WebHDFS works
curl -i "http://dwhdpm01p.xxx.com:50070/webhdfs/v1/user/admin/?op=LISTSTATUS"
Ambari alerts:
1. ZooKeeper Failover Controller Process
Connection failed: [Errno 111] Connection refused to dwhdpm01p.xxx.com:8019
2. NameNode Web UI
Connection failed to http://dwhdpm01p.xxx.com:50070 (timed out)
3. Ambari Server Alerts
There are 41 stale alerts from 1 host(s):
dwhdpm01p.xxx.com
[Ambari Agent Distro/Conf Select Versions (8d 23h 37m),
App Timeline Web UI (8d 23h 33m),
DataNode Health Summary (8d 23h 33m),
Grafana Web UI (8d 23h 32m),
HDFS Capacity Utilization (8d 23h 33m),
HDFS Pending Deletion Blocks (8d 23h 33m),
HDFS Storage Capacity Usage (Daily) (9d 5h 47m),
HDFS Storage Capacity Usage (Weekly) (9d 21h 47m),
HDFS Upgrade Finalized State (8d 23h 33m),
History Server CPU Utilization (8d 23h 37m),
History Server Process (8d 23h 32m),
History Server RPC Latency (8d 23h 37m),
History Server Web UI (8d 23h 33m),
Hive Metastore Process (8d 23h 34m),
Host Disk Usage (8d 23h 33m),
Metrics Collector - Auto-Restart Status (8d 23h 33m),
Metrics Collector - HBase Master CPU Utilization (8d 23h 37m),
Metrics Collector - HBase Master Process (8d 23h 33m),
Metrics Collector Process (8d 23h 33m),
Metrics Monitor Status (8d 23h 33m),
NameNode Blocks Health (8d 23h 33m),
NameNode Client RPC Processing Latency (Daily) (9d 5h 47m),
NameNode Client RPC Processing Latency (Hourly) (8d 23h 37m),
NameNode Client RPC Queue Latency (Daily) (9d 5h 47m),
NameNode Client RPC Queue Latency (Hourly) (8d 23h 37m),
NameNode Directory Status (8d 23h 33m),
NameNode Heap Usage (Daily) (9d 5h 47m),
NameNode Heap Usage (Weekly) (9d 21h 47m),
NameNode Host CPU Utilization (8d 23h 37m),
NameNode Last Checkpoint (8d 23h 33m),
NameNode RPC Latency (8d 23h 33m),
NameNode Service RPC Processing Latency (Daily) (9d 5h 47m),
NameNode Service RPC Processing Latency (Hourly) (8d 23h 37m),
NameNode Service RPC Queue Latency (Daily) (9d 5h 47m),
NameNode Service RPC Queue Latency (Hourly) (8d 23h 37m),
NameNode Web UI (8d 23h 33m),
NodeManager Health Summary (8d 23h 33m),
ResourceManager CPU Utilization (8d 23h 37m),
ResourceManager RPC Latency (8d 23h 37m),
ResourceManager Web UI (8d 23h 33m),
ZooKeeper Failover Controller Process (8d 23h 33m)]
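The first two alerts fail in different ways: port 8019 is refused, while port 50070 times out. A minimal Python sketch that reproduces that distinction outside Ambari might look like the following. It assumes it is run from the machine that raises the alert; the name `check_port` and the 5-second timeout are illustrative choices, not anything Ambari itself uses.

```python
import socket


def check_port(host, port, timeout=5.0):
    """Try one TCP connect and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <reason>".
    """
    try:
        # create_connection resolves the host name and connects in one call.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Errno 111 on Linux: nothing accepts connections on that address/port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else, for example a name that does not resolve.
        return "error: {}".format(exc)
```

Calling `check_port("dwhdpm01p.xxx.com", 8019)` and `check_port("192.168.1.160", 8019)` side by side may be informative, since the netstat output in the post shows both services bound to 192.168.1.160 only. If the two calls give different results, the host name probably resolves to a different address than the one the service listens on.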
12-18-2017
10:02 AM
Hi all,
Description: We found an error message in the Ambari alert "ZooKeeper Failover Controller Process".
Environment:
1. NameNode HA is enabled
2. iptables is disabled
3. SELinux is disabled
4. NameNode HA fails
5. IPv6 is enabled
6. telnet 192.168.1.160 8020 works
7. netstat -apn |grep 8020 shows a process
Error:
2017-12-18 09:40:01,115 WARN ha.HealthMonitor (HealthMonitor.java:doHealthChecks(211)) - Transport-level exception trying to monitor health of NameNode at dwhdpm01p.xxxx.com/192.168.1.160:8020: java.net.ConnectException: Connection refused Call From dwhdpm01p.xxxx.com/192.168.1.160 to dwhdpm01p.xxxx.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
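Since telnet to 192.168.1.160:8020 works while the failover controller is refused when it calls dwhdpm01p.xxxx.com:8020, and IPv6 is enabled, one thing that may be worth checking is which addresses the host name resolves to and whether each of them accepts connections. The following is a sketch of that check, not a diagnosis; `probe_resolved` is a hypothetical helper name and the 3-second timeout is arbitrary.

```python
import socket


def probe_resolved(host, port, timeout=3.0):
    """Resolve host and attempt a TCP connect to every address returned.

    Returns a list of (family_name, address, outcome) tuples, one per
    resolved address, so IPv4 and IPv6 results can be compared.
    """
    results = []
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            outcome = "open"
        except ConnectionRefusedError:
            outcome = "refused"
        except socket.timeout:
            outcome = "timeout"
        except OSError as exc:
            outcome = "error: {}".format(exc)
        results.append((family.name, sockaddr[0], outcome))
    return results
```

Running `probe_resolved("dwhdpm01p.xxxx.com", 8020)` on the NameNode host would list each resolved address with its result. An address that differs from 192.168.1.160, or an IPv6 entry that is refused, would point toward name resolution rather than the NameNode process itself.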
12-29-2015
08:55 AM
Step 1. vi /etc/hadoop/conf/yarn-site.xml
Add the following lines:
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm1</value>
  <description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm2</value>
  <description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>
<!-- RM1 Configs -->
<property>
  <name>yarn.resourcemanager.address.rm1</name>
  <value>sandbox.hortonworks.com:23140</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm1</name>
  <value>sandbox.hortonworks.com:23130</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>sandbox.hortonworks.com:23188</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
  <value>sandbox.hortonworks.com:23125</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm1</name>
  <value>sandbox.hortonworks.com:23141</value>
</property>
<!-- RM2 configs -->
<property>
  <name>yarn.resourcemanager.address.rm2</name>
  <value>sandbox.hortonworks.com:33140</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm2</name>
  <value>sandbox.hortonworks.com:33130</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>sandbox.hortonworks.com:33188</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
  <value>sandbox.hortonworks.com:33125</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm2</name>
  <value>sandbox.hortonworks.com:33141</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>sandbox.hortonworks.com:2181</value>
  <description>For multiple zk services, separate them with comma</description>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yarn-cluster</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
  <value>/yarn-leader-election</value>
  <description>Optional setting. The default value is /yarn-leader-election</description>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yarn-cluster</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-state-store.address</name>
  <value>sandbox.hortonworks.com:2181</value>
</property>
Then manually start ZooKeeper, start HDFS, and start YARN ...
[yarn@sandbox ~]$ yarn rmadmin -getServiceState rm1
15/12/29 08:43:30 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/192.168.182.145:23141. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From sandbox.hortonworks.com/192.168.182.145 to sandbox.hortonworks.com:23141 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[yarn@sandbox ~]$ yarn rmadmin -getServiceState rm2
standby
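The configuration in this post defines yarn.resourcemanager.ha.id twice (rm1 and rm2) and yarn.resourcemanager.cluster-id twice. As far as I know, when a name is repeated in one file Hadoop keeps only one value for it, typically the last one read, so this may be related to rm1 never listening on 23141 while rm2 reports standby. A small Python sketch for spotting repeated names in a site file is shown here. It assumes the file has the usual `<configuration>` root element, and `duplicate_properties` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_properties(xml_text):
    """Return the property names that appear more than once, sorted."""
    root = ET.fromstring(xml_text)
    # Collect the text of every <name> under a <property> element.
    names = [
        (prop.findtext("name") or "").strip()
        for prop in root.iter("property")
    ]
    counts = Counter(name for name in names if name)
    return sorted(name for name, seen in counts.items() if seen > 1)
```

For example, `duplicate_properties(open("/etc/hadoop/conf/yarn-site.xml").read())` would return the repeated names in the file from Step 1. If two ResourceManagers really have to run on one node, each process would probably need its own copy of the file with a single yarn.resourcemanager.ha.id value.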