Member since
03-06-2019
104
Posts
1
Kudos Received
2
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
339 | 11-23-2023 05:58 AM | |
436 | 11-23-2023 05:45 AM |
04-03-2024
09:30 AM
Hi @s198, you do not need HDFS or a DataNode role on the remote server. You only need an HDFS gateway on the remote server, from which you can pull the data using distcp. If you are using HDP or CDP, you can add the remote server as a gateway and run distcp on it. Another option is to share a directory from the remote server, mount it on a Hadoop cluster node, and run distcp to that mounted directory.
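For the mounted-directory option, here is a minimal dry-run sketch that only builds and prints the distcp command. The NameNode address, source path, and mount point are hypothetical placeholders, and it assumes the share is mounted at the same path on every node that runs distcp tasks.

```shell
#!/usr/bin/env bash
# Dry-run sketch: build (but do not run) a distcp command that copies an HDFS
# path into a directory mounted from the remote server.
# All three values passed below are hypothetical placeholders.

build_distcp_cmd() {
  local nn_addr="$1"    # NameNode host:port, e.g. namenode_hostname:8020
  local src_path="$2"   # absolute HDFS path to copy
  local mount_dir="$3"  # absolute local mount point of the remote share
  # The file:// scheme tells distcp to write to the local (mounted) file system.
  echo "hadoop distcp hdfs://${nn_addr}${src_path} file://${mount_dir}"
}

build_distcp_cmd "namenode_hostname:8020" "/data/export" "/mnt/remote_share"
```

Review the printed command and run it manually once the paths match your environment.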
03-22-2024
06:34 AM
Introduction: In large Hadoop clusters, efficiently managing block replication and DataNode decommissioning is crucial for maintaining system performance and reliability. However, updating NameNode configuration parameters to tune these processes usually requires a NameNode restart, which causes downtime and can disrupt cluster operations. This article describes a procedure to speed up block replication and DataNode decommissioning in HDFS without restarting the NameNode.

Procedure:

Identify the NameNode process directory: Locate the process directory of the current active NameNode. It typically resides under /var/run/cloudera-scm-agent/process/ in a folder named like "###-hdfs-NAMENODE".

Modify configuration parameters: Edit the hdfs-site.xml file in the NameNode process directory and increase the following parameters:
dfs.namenode.replication.max-streams (e.g., 100)
dfs.namenode.replication.max-streams-hard-limit (e.g., 200)
dfs.namenode.replication.work.multiplier.per.iteration (e.g., 100)

Apply configuration changes: Run the command below to start the reconfiguration:
#hdfs dfsadmin -reconfig namenode <namenode_address> start
<namenode_address> is the value of "dfs.namenode.rpc-address" in hdfs-site.xml.

Verify configuration changes: Monitor the reconfiguration status with:
#hdfs dfsadmin -reconfig namenode <namenode_address> status
Once it completes, verify that the changes were applied. The output looks similar to this:
#hdfs dfsadmin -reconfig namenode namenode_hostname:8020 status
Reconfiguring status for node [namenode_hostname:8020]: started at Fri Mar 22 08:15:12 UTC 2024 and finished at Fri Mar 22 08:15:12 UTC 2024.
SUCCESS: Changed property dfs.namenode.replication.max-streams-hard-limit From: "40" To: "200"
SUCCESS: Changed property dfs.namenode.replication.work.multiplier.per.iteration From: "10" To: "100"
SUCCESS: Changed property dfs.namenode.replication.max-streams From: "20" To: "100"

Revert configuration changes (optional): If needed, restore the original values by repeating the steps above with the original parameter values.

Conclusion: With this procedure, administrators can speed up block replication and DataNode decommissioning in HDFS without a NameNode restart. This minimizes downtime and keeps cluster management efficient, even in environments where NameNode High Availability is not yet implemented or desired.

Note: Test configuration changes in a non-production environment before applying them to a live cluster. Also consult the Hadoop documentation and consider any specific requirements or constraints of your cluster before modifying its configuration.
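The lookup of <namenode_address> and the two reconfig commands can be sketched as a small helper. This is a minimal sketch, not part of the original procedure: the function names get_property and print_reconfig_commands are hypothetical, it assumes each property's <name> is directly followed by its <value> in hdfs-site.xml, and it only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Sketch: read a property from hdfs-site.xml and print the reconfig commands.
# Assumption: each <name> element is directly followed by its <value> element.
# Note: HA clusters use suffixed keys (dfs.namenode.rpc-address.<ns>.<nn>),
# which this exact-name lookup will not match.

get_property() {
  local site_xml="$1"   # path to hdfs-site.xml
  local prop_name="$2"  # exact property name
  # Flatten the file to one line, then capture the <value> after the <name>.
  tr -d '\n' < "$site_xml" |
    sed -n "s|.*<name>${prop_name}</name>[^<]*<value>\([^<]*\)</value>.*|\1|p"
}

print_reconfig_commands() {
  local site_xml="$1"
  local addr
  addr=$(get_property "$site_xml" "dfs.namenode.rpc-address")
  if [ -z "$addr" ]; then
    echo "dfs.namenode.rpc-address not found in $site_xml" >&2
    return 1
  fi
  # Dry run: print the commands so they can be reviewed before use.
  echo "hdfs dfsadmin -reconfig namenode $addr start"
  echo "hdfs dfsadmin -reconfig namenode $addr status"
}

# Usage: print_reconfig_commands <path to hdfs-site.xml in the process directory>
```

Printing instead of executing keeps the sketch safe to run anywhere; copy the printed lines into a shell on the NameNode host once the address looks right.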
01-08-2024
08:36 AM
Hi, that's your Standby NameNode (SBNN). Please verify whether it is performing checkpoints. Running one checkpoint from Cloudera Manager should clear the health test.
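One rough way to check whether checkpoints are happening is to look at the newest fsimage file in the SBNN's name directory. The sketch below assumes you know that directory (for example something like /data/dfs/nn/current, which is an assumption here) and that the latest_fsimage helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the most recently modified fsimage file in a NameNode
# "current" directory. A stale file suggests checkpoints are not running.

latest_fsimage() {
  local nn_current_dir="$1"
  # fsimage files are named fsimage_<txid>; skip the .md5 checksum files.
  ls -t "$nn_current_dir"/fsimage_* 2>/dev/null | grep -v '\.md5$' | head -n 1
}

# Usage (path is an assumption): ls -l "$(latest_fsimage /data/dfs/nn/current)"
```

Compare the file's modification time with the configured checkpoint period to judge whether the SBNN is keeping up.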
01-08-2024
07:32 AM
Hi @George-Megre, it looks like your SBNN is not performing checkpoints. Can you run one manual checkpoint now and see what happens on the SBNN side? Also verify that the Solr server and port are reachable: ip-10-2-0-224.ec2.internal:8993.
12-24-2023
08:23 AM
Then you need to run bootstrap standby on the NameNode that is not running, or sync the NameNode data directory from the running NameNode to the one that is down.
12-24-2023
03:09 AM
Then is some other NameNode talking to this JournalNode process? Is there any way you can find another NameNode that is active and also configured to talk to it?
12-22-2023
04:23 AM
Hi @George-Megre, I don't think HMS can write to the edit log. Only NameNodes are allowed to write to it, unless you are using some plugin that does so; I am not sure about that use case. Can you please run commands like the ones below to find out who is writing to the edits file?
[root@c4265-node3 current]# lsof /data/dfs/nn/current/edits_inprogress_0000000000003796588
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 1133 hdfs 324u REG 0,87 1048576 21715133 /data/dfs/nn/current/edits_inprogress_0000000000003796588
[root@c4265-node3 current]# ps -h 1133
1133 ? Sl 5:18 /usr/java/jdk1.8.0_232-cloudera/bin/java -Dproc_namenode -Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.conf=/etc/krb5.conf -Xms1073741824 -Xm
[root@c4265-node3 current]#
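The same check can be scripted. The sketch below filters lsof output for processes that hold the file open for writing: an FD column ending in "w" means write access and "u" means read/write, as in the 324u entry above. The writer_pids helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read `lsof <file>` output on stdin and print the PIDs of processes
# that have the file open for writing.
# Expected columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME

writer_pids() {
  # Skip the header line; keep rows whose FD is <number>w or <number>u.
  awk 'NR > 1 && $4 ~ /^[0-9]+[uw]/ { print $2 }' | sort -u
}

# Usage:
#   lsof /data/dfs/nn/current/edits_inprogress_0000000000003796588 | writer_pids
#   ps -o pid=,user=,args= -p <pid printed above>
```

Any PID that is not a NameNode process on that host is worth a closer look.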
12-21-2023
08:02 AM
Hi @George-Megre, if the NameNode itself is down, who is writing to edits_inprogress_0000000000011353525 as of Dec 20th?
12-21-2023
05:14 AM
Hi @michalLi, I have been trying this in CDP PvC, but it does not seem to work. Here is the behavior I see for the Spark History Server web UI (7.1.7.2000):
TLS enabled and Kerberos enabled: without keytab, https://172.25.42.2:18088 works fine.
TLS disabled and Kerberos enabled: with or without keytab, http://172.25.42.2:18088 fails with 401 Auth on macOS/Chrome.
12-20-2023
10:55 PM
Hi @George-Megre, it looks like there is a lot of mismatch. I can see the edits file below being updated by the other NameNode or by some other NameNode:
-rw-r--r-- 1 hdfs hdfs 17825792 Dec 20 12:21 edits_inprogress_0000000000011353525
Could you please verify that no other NN is using these JNs? I would also request that you attach both NameNodes' data directories along with all 3 JNs' data directories.