Member since
08-08-2017
1652
Posts
30
Kudos Received
11
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2186 | 06-15-2020 05:23 AM |
| | 18832 | 01-30-2020 08:04 PM |
| | 2345 | 07-07-2019 09:06 PM |
01-19-2021
09:10 AM
We have an Ambari-managed cluster running HDP version `2.6.5`. The cluster has two NameNodes (one active, one standby) and 65 DataNode machines.

The standby NameNode does not start, and its log shows the following:

2021-01-01 15:19:43,269 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.IOException: There appears to be a gap in the edit log. We expected txid 90247527115, but got txid 90247903412.
at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)

For now the active NameNode is up, but the standby NameNode is down because of this exception:

java.io.IOException: There appears to be a gap in the edit log. We expected txid 90247527115, but got txid 90247903412.

What is the preferred way to fix this problem?
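To see how large the gap is, the two txids in the exception can be subtracted. The following is only a minimal sketch that parses the message text quoted above; `edit_log_gap` is a hypothetical helper name, not part of Hadoop, and it does not repair anything.

```python
import re

# Message text copied from the NameNode log in the post.
LOG_MESSAGE = (
    "java.io.IOException: There appears to be a gap in the edit log. "
    "We expected txid 90247527115, but got txid 90247903412."
)

GAP_PATTERN = re.compile(r"expected txid (\d+), but got txid (\d+)")


def edit_log_gap(message):
    """Return (expected, got, missing) parsed from a gap message.

    Returns None when the message does not contain the two txids.
    """
    match = GAP_PATTERN.search(message)
    if match is None:
        return None
    expected = int(match.group(1))
    got = int(match.group(2))
    # Transactions expected .. got-1 were not available to the loader.
    return expected, got, got - expected


expected, got, missing = edit_log_gap(LOG_MESSAGE)
print(f"first missing txid: {expected}")
print(f"next txid found:    {got}")
print(f"missing transactions: {missing}")
```

The count only describes the size of the gap as reported by the exception; it says nothing about why the edits are missing.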
Labels: HDFS
01-17-2021
09:19 AM
We installed a small HDP cluster with one DataNode machine. The HDP version is `2.6.5` and the Ambari version is `2.6.1`. This is a new cluster that contains two NameNodes and only one DataNode (worker machine).

The interesting behavior we see is that the `under replica` count on the Ambari dashboard keeps increasing; for now the number is `15000` under-replicated blocks.

As far as we know, the most common root cause of this problem is a network issue between the NameNode and the DataNode, but this isn't the case in our Hadoop cluster.

We can also decrease the under-replicated count with the following procedure:

su - <$hdfs_user>
bash-4.1$ hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files
bash-4.1$ for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :" ; hadoop fs -setrep 3 $hdfsfile; done

But we do not want to do that, because the under-replication problem should not happen in the first place. Maybe we need to tune some HDFS parameters, but we are not sure about this.

Please let us know of any advice that can help us.
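For readers unsure what the `grep` and `awk` step in the procedure collects, here is a rough Python equivalent run against sample text. This is a sketch only: the sample lines, paths, and block IDs are invented for illustration and assume fsck prints one `<path>: Under replicated ...` line per block, and `under_replicated_files` is a hypothetical helper name.

```python
# Illustrative fsck-style lines; paths and block IDs are invented.
SAMPLE_FSCK_OUTPUT = """\
/data/a.csv:  Under replicated BP-1:blk_1001_1. Target Replicas is 3 but found 1 live replica(s).
/data/b.csv:  Under replicated BP-1:blk_1002_1. Target Replicas is 3 but found 1 live replica(s).
/data/b.csv:  Under replicated BP-1:blk_1003_1. Target Replicas is 3 but found 1 live replica(s).
Status: HEALTHY
"""


def under_replicated_files(fsck_output):
    """Mimic: grep 'Under replicated' | awk -F':' '{print $1}'.

    Unlike the shell pipeline, duplicate paths are dropped, so a file
    with several under-replicated blocks is listed only once.
    """
    paths = []
    for line in fsck_output.splitlines():
        if "Under replicated" not in line:
            continue
        # Everything before the first ':' is the HDFS file path.
        path = line.split(":", 1)[0]
        if path not in paths:
            paths.append(path)
    return paths


for path in under_replicated_files(SAMPLE_FSCK_OUTPUT):
    print(path)
```

Like the `awk -F':'` step, this would cut off any path that itself contains a colon.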
Labels: HDFS
10-09-2020
12:59 AM
You said there is no need to run it, but the post I mentioned says to run it, so which is right?
09-13-2020
09:17 AM
Hi all,

We are now changing the hostname configuration on a production cluster according to this document: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.0.0/administering-ambari/content/amb_changing_host_names.html

The last step says: "in case NameNode HA enabled, then need to run the following command on one of the name node"

hdfs zkfc -formatZK -force

Since we have an active NameNode and a standby NameNode, we assume that NameNode HA is enabled on our cluster. But we want to understand the risks of running the following command on one of the NameNodes:

hdfs zkfc -formatZK -force

Is this command safe to run, without risks?
Labels: Ambari Blueprints
09-13-2020
09:08 AM
Thank you for the post, but I have another question.

According to the document https://docs.cloudera.com/HDPDocuments/Ambari-2.7.0.0/administering-ambari/content/amb_changing_host_names.html, the last step says that if NameNode HA is enabled, the following command needs to be run on one of the NameNodes:

hdfs zkfc -formatZK -force

Since we have an active NameNode and a standby NameNode, we assume that NameNode HA is enabled on our cluster. But we want to understand the risks of running the following command on one of the NameNodes:

hdfs zkfc -formatZK -force

Is this command safe to run, without risks?
09-08-2020
01:42 PM
We have an HDP cluster, version `2.6.5`, with Ambari version `2.6.1`.
The cluster includes 3 master machines and 211 DataNode machines (worker machines). All machines run `rhel 7.2`.
For example, the master machines are:
master1.sys77.com , master2.sys77.com , master3.sys77.com …
And the DataNode machines are:
worker01.sys77.com , worker02.sys77.com ----> worker211.sys77.com
Now we want to change the domain name from `sys77.com` to `bigdata.com`.
What is the procedure for replacing the `domain name` (`sys77.com`) of a Hadoop cluster (an HDP cluster with Ambari)?
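Whatever procedure is used, it will need a complete old-to-new hostname mapping for all 214 machines. The sketch below builds one from the naming scheme in the post. Several things are assumptions: the worker names are taken to be zero-padded to two digits (`worker01` … `worker99`, then `worker100` … `worker211`), the cluster name `mycluster` is hypothetical, and the `{cluster: {old: new}}` JSON layout is what I understand Ambari's host-rename procedure to expect, so check it against the Ambari documentation before using it.

```python
import json

OLD_DOMAIN = "sys77.com"
NEW_DOMAIN = "bigdata.com"
CLUSTER_NAME = "mycluster"  # hypothetical; replace with the real cluster name


def build_host_mapping(old_domain, new_domain):
    """Map every old FQDN to the same short name under the new domain."""
    # 3 masters: master1 .. master3
    short_names = [f"master{i}" for i in range(1, 4)]
    # 211 workers: worker01 .. worker211 (padding is an assumption)
    short_names += [f"worker{i:02d}" for i in range(1, 212)]
    return {
        f"{name}.{old_domain}": f"{name}.{new_domain}"
        for name in short_names
    }


mapping = build_host_mapping(OLD_DOMAIN, NEW_DOMAIN)
print(f"{len(mapping)} hosts to rename")

# Assumed layout: one top-level key per cluster, old name -> new name.
changes_json = json.dumps({CLUSTER_NAME: mapping}, indent=2)
print(changes_json[:200])
```

Generating the mapping rather than typing it by hand makes it easy to diff against the actual host list in Ambari before any change is made.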
08-13-2020
02:04 PM
Another question: let's say the last snapshot is corrupted. How does ZooKeeper know to use the good snapshot before the last one?
08-13-2020
01:59 PM
Can you also explain the difference between the snapshot files and the log files that ZooKeeper keeps under `version-2`?
08-13-2020
01:57 PM
So if you do not recommend keeping 3 backups (I feel you recommend more than 3), how many backups do we need in order to sleep well? :-)
08-13-2020
12:45 PM
ZooKeeper server creates snapshot and log files, but never deletes them. So we need to take care of the retention policy.

How do we decide on the right number of ZooKeeper snapshot files to retain?

Note that the ZooKeeper server itself only needs the latest complete fuzzy snapshot and the log files from the start of that snapshot. But since ZooKeeper creates backups of the snapshot file, how many ZooKeeper snapshot backups do we need to retain?

Sometimes snapshots can be corrupted, so the number of retained snapshot backups should take this into consideration.

On our ZooKeeper server we saw that a snapshot backup is created each day.

Example of the snapshot files from my ZooKeeper server:

-rw-r--r-- 1 ZooKeeper hadoop 458138861 Aug 10 07:12 snapshot.19000329d1
-rw-r--r-- 1 ZooKeeper hadoop 458138266 Aug 10 07:13 snapshot.19000329de
-rw-r--r-- 1 ZooKeeper hadoop 458143590 Aug 10 09:24 snapshot.1900032d7a
-rw-r--r-- 1 ZooKeeper hadoop 458142983 Aug 10 09:25 snapshot.1900032d84
-rw-r--r-- 1 ZooKeeper hadoop 458138686 Aug 11 03:42 snapshot.1900034b74
-rw-r--r-- 1 ZooKeeper hadoop 458138686 Aug 12 01:51 snapshot.1900036fa3
-rw-r--r-- 1 ZooKeeper hadoop 458138079 Aug 12 03:03 snapshot.1900037196
-rw-r--r-- 1 ZooKeeper hadoop 458138686 Aug 12 03:08 snapshot.19000371c8
-rw-r--r-- 1 ZooKeeper hadoop 458138432 Aug 12 03:09 snapshot.19000371de
-rw-r--r-- 1 ZooKeeper hadoop 458138091 Aug 12 12:02 snapshot.1900038053
-rw-r--r-- 1 ZooKeeper hadoop 458138091 Aug 12 18:04 snapshot.1900038a39
-rw-r--r-- 1 ZooKeeper hadoop 458138091 Aug 13 13:01 snapshot.190003a923
-rw-r--r-- 1 ZooKeeper hadoop 2 Aug 13 13:01 currentEpoch
-rw-r--r-- 1 ZooKeeper hadoop 67108880 Aug 13 21:17 log.190002d2ce
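To make the retention question concrete, here is a small sketch that splits the snapshot files from the listing into "keep" and "purge candidates" for a given retain count. It orders files by the hexadecimal zxid suffix in the file name. The retain count of 3 and the helper names are assumptions for illustration, and the sketch only prints names; it deletes nothing.

```python
# Snapshot file names copied from the listing in the post.
SNAPSHOT_FILES = [
    "snapshot.19000329d1", "snapshot.19000329de", "snapshot.1900032d7a",
    "snapshot.1900032d84", "snapshot.1900034b74", "snapshot.1900036fa3",
    "snapshot.1900037196", "snapshot.19000371c8", "snapshot.19000371de",
    "snapshot.1900038053", "snapshot.1900038a39", "snapshot.190003a923",
]

RETAIN_COUNT = 3  # assumed value for this sketch, not a recommendation


def zxid(file_name):
    """Parse the hexadecimal zxid suffix of a snapshot file name."""
    return int(file_name.split(".", 1)[1], 16)


def split_by_retention(files, retain_count):
    """Return (keep, purge): the newest retain_count files and the rest."""
    if retain_count < 1:
        raise ValueError("retain_count must be at least 1")
    ordered = sorted(files, key=zxid)
    return ordered[-retain_count:], ordered[:-retain_count]


keep, purge = split_by_retention(SNAPSHOT_FILES, RETAIN_COUNT)
print("keep: ", keep)
print("purge candidates:", len(purge))
```

Any real cleanup also has to keep the transaction log files that the oldest retained snapshot depends on, which this sketch does not model. ZooKeeper's own `autopurge.snapRetainCount` and `autopurge.purgeInterval` settings apply the same keep-the-newest-N idea.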
Labels: Apache Kafka