Member since 08-08-2017
1652 Posts
30 Kudos Received
11 Solutions
12-06-2017
12:13 PM
We have an Ambari cluster with three master machines. ZooKeeper is running on master01 and master03. On master02 the ZooKeeper clientPort is set to 2181, and we see from the logs that ZooKeeper does not start because the port is already in use, as the following output shows (on master02):
netstat -tnlpa | grep 2181
tcp 0 0 10.164.28.152:38219 10.164.28.153:2181 ESTABLISHED 51471/java
tcp 0 0 10.164.28.152:38218 10.164.28.153:2181 ESTABLISHED 51471/java
tcp 0 0 10.164.28.152:38707 10.164.27.162:2181 ESTABLISHED 51471/java
tcp6 0 0 :::2181 :::* LISTEN 24847/java
tcp6 0 0 10.164.28.152:2181 10.164.28.152:48270 ESTABLISHED 24847/java
tcp6 0 0 10.164.28.152:2181 10.164.28.152:40876 ESTABLISHED 24847/java
tcp6 0 0 10.164.28.152:39342 10.164.27.162:2181 ESTABLISHED 39008/java
tcp6 0 0 10.164.28.152:2181 10.164.27.162:53094 ESTABLISHED 24847/java
tcp6 0 0 10.164.28.152:37998 10.164.28.153:2181 ESTABLISHED 38168/java
tcp6 0 0 10.164.28.152:45319 10.164.28.153:2181 ESTABLISHED 39008/java
tcp6 0 0 10.164.28.152:39133 10.164.28.153:2181 ESTABLISHED 39008/java
tcp6 0 0 10.164.28.152:2181 10.164.28.153:33491 ESTABLISHED 24847/java
tcp6 0 0 10.164.28.152:37092 10.164.28.153:2181 ESTABLISHED 48592/java
tcp6 0 0 10.164.28.152:41413 10.164.28.153:2181 ESTABLISHED 28226/java
tcp6 0 0 10.164.28.152:40876 10.164.28.152:2181 ESTABLISHED 38168/java
So my first question is: is it possible to work around this problem by using another port, such as 2182, just to start ZooKeeper on the master02 machine? Or is there another suggestion for how to start the ZooKeeper server on master02?
Second, how can it be that a Java process in the Ambari cluster uses the port that is allocated to ZooKeeper? Why did this happen?
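To narrow down which process actually holds the port, it can help to separate the LISTEN line from the ESTABLISHED client connections in the netstat output. Below is a minimal sketch in Python, assuming the seven-column layout of `netstat -tnlpa` shown above; the function name `find_listeners` and the `SAMPLE` text (three lines copied from the output above) are only illustrative.

```python
def find_listeners(netstat_output, port):
    """Return (pid, program) pairs for sockets in LISTEN state on the given port."""
    listeners = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) < 7 or fields[5] != "LISTEN":
            continue
        # The local address ends with ":<port>" for both IPv4 and IPv6 entries
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port == str(port):
            pid, _, program = fields[6].partition("/")
            listeners.append((pid, program))
    return listeners


# A few lines copied from the netstat output in this post
SAMPLE = """\
tcp 0 0 10.164.28.152:38219 10.164.28.153:2181 ESTABLISHED 51471/java
tcp6 0 0 :::2181 :::* LISTEN 24847/java
tcp6 0 0 10.164.28.152:2181 10.164.28.152:48270 ESTABLISHED 24847/java
"""

print(find_listeners(SAMPLE, 2181))
```

In the output above the only LISTEN entry on 2181 belongs to PID 24847, so a command such as `ps -fp 24847` on master02 should show which Java process that is.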
Labels: Apache Ambari, Apache Hadoop
12-06-2017
10:29 AM
@Jay excellent answer.
12-06-2017
10:10 AM
What do we need to look for in the log in order to tell that the file is corrupted?
12-06-2017
09:53 AM
We noticed that a corrupted edits_inprogress_xxxxxxxx file (under /hadoop/hdfs/journal/hdfsha/current) could be the reason why the NameNode does not start correctly, or starts as standby and not as active, so I am sharing this issue with you.
I want to find a simple way to identify a corrupted edits_inprogress_xxxxxxxx file. I found the following way, but I need Hortonworks approval for it. My idea is to use the file command: when the file command returns "data" the file is OK; otherwise the edits_inprogress_xxxxxxxx file is corrupted. Can we trust this verification?
# file edits_inprogress_0000000000075117774
edits_inprogress_0000000000075117774: ISO-8859 text, with very long lines, with no line terminators
# file edits_inprogress_0000000000075149670
edits_inprogress_0000000000075149670: data
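As a possible cross-check next to the file command, Hadoop ships the Offline Edits Viewer (`hdfs oev`), which tries to parse an edits file and write it out as XML. The sketch below is only an illustration in Python: the helper name `edits_file_parses` is made up, and it assumes that `hdfs` is on the PATH and that a non-zero exit status means the file could not be parsed. How the tool behaves on an in-progress segment may depend on the Hadoop version.

```python
import os
import subprocess
import tempfile


def edits_file_parses(path, run=subprocess.run):
    """Return True if `hdfs oev` converts the edits file without an error exit status.

    `run` is injectable so the helper can be exercised without a Hadoop install.
    """
    fd, out_path = tempfile.mkstemp(suffix=".xml")
    os.close(fd)
    try:
        # -i: input edits file, -o: output file (XML is the default processor)
        result = run(
            ["hdfs", "oev", "-i", path, "-o", out_path],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        # Assumption: a non-zero exit status signals a parse failure
        return result.returncode == 0
    finally:
        os.remove(out_path)


# Example call on a JournalNode host (not executed here):
# edits_file_parses("edits_inprogress_0000000000075117774")
```

It is safer to run this on a copy of the file than on the live journal directory.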
Labels: Apache Ambari, Apache Hadoop
12-06-2017
09:29 AM
@Jay, can you explain a little about the edits_inprogress_xxxxxxx file?
12-06-2017
08:55 AM
@Jay well done, your solution is brilliant. Both NameNodes are now up: one is active and the second is standby, as it should be. Thank you so much for the time you spent on this case.
12-05-2017
10:18 PM
The logs: namenodelog.txt
12-05-2017
10:07 PM
@Jay maybe we first need to focus on what is blocking the port, or why the port does not come up.
12-05-2017
10:01 PM
The picture for now: the JournalNodes are running, and the ZooKeeper Failover Controllers are running as well. Second, we performed a full restart more than twice, but without results.
12-05-2017
09:56 PM
The errors are:
2017-12-05 21:46:14,814 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [100.164.28.153:8485, 100.164.28.152:8485, 100.164.27.162:8485], stream=null))
at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
I also checked port 50070 with telnet:
telnet localhost 50070
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
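The same check can be scripted, for example to poll the port after each restart. This is a minimal sketch in Python; the helper name `port_is_open` and the 3-second timeout are arbitrary choices for illustration.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection tries each resolved address in turn
        # (for example ::1, then 127.0.0.1), much like telnet does
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors
        return False


print(port_is_open("localhost", 50070))
```

A result of False matches the "Connection refused" lines above: nothing is accepting connections on 50070 on this host.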