About ArpitAgarwal

DhanoojK · ‎08-10-2023

Try this option:

[serviceaccount@edgenode ~]$ hdfs getconf -confKey dfs.nameservices
hadoopcdhnn
[serviceaccount@edgenode ~]$ hdfs getconf -confKey dfs.ha.namenodes.hadoopcdhnn
namenode5605,namenode5456
[serviceaccount@edgenode ~]$ hdfs haadmin -getServiceState namenode5605
active
[serviceaccount@edgenode ~]$ hdfs haadmin -getServiceState namenode5456
standby
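The same lookup can be scripted. Below is a minimal Python sketch, not something from the original reply, that chains the three commands shown in this post. It assumes a single HA nameservice and that the `hdfs` binary is on the PATH of the user running it. The function names `run_hdfs` and `namenode_states` are made up for illustration.

```python
import subprocess
from typing import Callable, Dict, List


def run_hdfs(args: List[str]) -> str:
    """Run an `hdfs` subcommand and return its trimmed standard output."""
    result = subprocess.run(
        ["hdfs", *args], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def namenode_states(run: Callable[[List[str]], str] = run_hdfs) -> Dict[str, str]:
    """Map each NameNode ID to the state reported by `hdfs haadmin`.

    Assumption: exactly one nameservice is configured. Federated clusters
    are out of scope for this sketch, so they are rejected explicitly.
    """
    nameservices = [
        ns.strip()
        for ns in run(["getconf", "-confKey", "dfs.nameservices"]).split(",")
        if ns.strip()
    ]
    if len(nameservices) != 1:
        raise ValueError(f"expected one nameservice, found {nameservices}")

    # Same key as in the post: dfs.ha.namenodes.<nameservice>
    key = f"dfs.ha.namenodes.{nameservices[0]}"
    namenode_ids = [
        nn.strip() for nn in run(["getconf", "-confKey", key]).split(",") if nn.strip()
    ]

    # Ask each NameNode for its HA state (active or standby).
    return {nn: run(["haadmin", "-getServiceState", nn]) for nn in namenode_ids}
```

On the edge node from the example, calling `namenode_states()` should return a dictionary mapping namenode5605 to active and namenode5456 to standby. The `run` parameter exists only so the logic can be exercised without a live cluster.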

AshishKr · ‎05-02-2023

Please add the zkcli commands to log in to ZooKeeper and remove the znode directory. I hope that makes sense.

zookeeper-client -server <zookeeper-server-host>:2181
(You may need sudo if there is a permission issue, or log in as the HDFS user.)
ls / or ls /hadoop-ha
(If you don't see a /hadoop-ha znode in the ZK znode list, skip the step below.)
rmr /hadoop-ha/nameservice1

pravin1406 · ‎09-08-2021

This whole series is really insightful and helpful!

VidyaSargur · ‎03-22-2021

Hi @Priya09, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.

horse · ‎12-01-2020

Any details, explanation or pointers on why this is the case?

Prav · ‎08-02-2019

Thanks. To make sure we are on the same page, consider the scenario below: the snapshottable HDFS location /a/b/ contains a file c that has been snapshotted. Suppose c is deleted from HDFS using the CLI (hdfs dfs -rm -r -skipTrash), so the NameNode transaction has happened and the HDFS CLI no longer shows the file, and then a new file is created with the same content, size and name.
- What gets stored in HDFS? What is the delta that the snapshot adds to HDFS in this case?
--> Is it just that the snapshot still holds the blocks of c in HDFS, in addition to the same file that was newly created?
--> Are NameNode resources used to maintain the metadata of both in heap?
Is this all, or is there more to it?
Regards

ArpitAgarwal · ‎02-27-2018

Building Apache Tez with Apache Hadoop 2.8.0 or later fails due to client/server jar separation in Hadoop [1]. The build fails with the following error:

[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /src/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[48,30] cannot find symbol
  symbol: class DistributedFileSystem
  location: package org.apache.hadoop.hdfs
[ERROR] /src/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[680,50] cannot find symbol
  symbol: class DistributedFileSystem
  location: class org.apache.tez.client.TestTezClientUtils
[ERROR] /src/tez/ecosystem/tez/tez-api/src/test/java/org/apache/tez/common/TestTezCommonUtils.java:[62,42] cannot access org.apache.hadoop.hdfs.DistributedFileSystem

To get Tez to compile successfully, you will need to use the new hadoop28 profile introduced by TEZ-3690 [2]. E.g. here is how you compile Tez with Apache Hadoop 3.0.0:

mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Phadoop28 -Dhadoop.version=3.0.0

References:
1. HDFS-6200: Create a separate jar for hdfs-client
2. TEZ-3690: Tez on hadoop 3 build failed due to hdfs client/server jar separation.

ArpitAgarwal · ‎06-22-2017

The HDFS NameNode ensures that each block is sufficiently replicated. When it detects the loss of a DataNode, it instructs the remaining nodes to maintain adequate replication by creating additional block replicas.

For each lost replica, the NameNode picks a (source, destination) pair where the source is an available DataNode with another replica of the block and the destination is the target for the new replica. The re-replication work can be massively parallelized in large clusters since the replica distribution is randomized.

In this article, we estimate a lower bound for the recovery time.

Simplifying Assumptions

- The maximum IO bandwidth of each disk is 100MB/s (reads + writes). This is true for the vast majority of clusters that use spinning disks.
- The aggregate IO capacity of the cluster is limited by disk and not the network. This is not always true but helps us establish lower bounds without discussing network topologies.
- Block replicas are uniformly distributed across the cluster and disk usage is uniform. True if the HDFS balancer was run recently.

Theoretical Lower Bound

Let's assume the cluster has n nodes. Each node has p disks, and the usage of each disk is c TeraBytes. The data usage of each node is thus (p ⋅ c) TB.

The amount of data transfer needed for recovery is twice the capacity of the lost DataNode, as each replica must be read once from a source disk and written once to the target disk.

Data transfer during recovery = 2 ⋅ (Node Capacity)
= (2 ⋅ p ⋅ c) TB
= (2 ⋅ p ⋅ c ⋅ 1,000,000) MB

The re-replication rate is then limited by the available aggregate IO bandwidth in the cluster:

Cluster aggregate IO bandwidth = (Disk IO bandwidth) ⋅ (Number of disks)
= (100 ⋅ n ⋅ p) MB/s

Thus:

Minimum Recovery Time = (Data transfer during recovery) / (Cluster aggregate IO bandwidth)
= (2 ⋅ p ⋅ c ⋅ 1,000,000) / (100 ⋅ n ⋅ p)
= (20,000 ⋅ c/n) seconds.

where:
c = Mean usage of each disk in TB.
n = Number of DataNodes in the cluster.

This is the absolute best case with no other load, no network bandwidth limits, and a perfectly efficient scheduler.

E.g. in a 100-node cluster where each disk has 4TB of data, recovery from the loss of a DataNode must take at least (20,000 ⋅ 4) / 100 = 800 seconds or approximately 13 minutes.

Clearly, the cluster size bounds the recovery time. Disk capacities being equal, a 1000-node cluster can recover 10x faster than a 100-node cluster.

A More Practical Lower Bound

The theoretical lower bound assumes that block re-replications can be instantaneously scheduled across the cluster. It also assumes that all cluster IO capacity is available for re-replication, whereas in practice application reads and writes also consume IO capacity.

The NameNode schedules 2 outbound replication streams per DataNode, per heartbeat interval, to throttle re-replication traffic. This throttle allows DataNodes to remain responsive to applications. The throttle can be adjusted via the configuration setting dfs.namenode.replication.max-streams. Let's call this m and the heartbeat interval h.

Also let's assume the mean block size in the cluster is b MB. Then:

Re-replication Rate = Blocks replicated cluster-wide per heartbeat interval
= (n ⋅ m/h) Blocks/s

The total number of blocks to be re-replicated is the capacity of the lost node divided by the mean block size.

Number of Blocks Lost = (p ⋅ c) TB / b MB
= (p ⋅ c ⋅ 1,000,000/b).

Thus:

Recovery Time = (Number of Blocks Lost) / (Re-replication Rate)
= (p ⋅ c ⋅ 1,000,000) / (b ⋅ n ⋅ m/h)
= (p ⋅ c ⋅ h ⋅ 1,000,000) / (b ⋅ n ⋅ m) seconds.

where:
p = Number of disks per node.
c = Mean usage of each disk in TB.
h = Heartbeat interval (default = 3 seconds).
b = Mean block size in MB.
n = Number of DataNodes in the cluster.
m = dfs.namenode.replication.max-streams (default = 2)

Simplifying by plugging in the defaults for h and m, we get:

Minimum Recovery Time (seconds) = (p ⋅ c ⋅ 1,500,000) / (b ⋅ n)

E.g. in the same cluster, assuming the mean block size is 128MB and each node has 8 disks, the practical lower bound on recovery time will be 3,750 seconds or ~1 hour.

Reducing the Recovery Time

The recovery time can be reduced by:

- Increasing dfs.namenode.replication.max-streams. However, setting this value too high can affect cluster performance. Note that increasing this value beyond 4 must be evaluated carefully and also requires changing the safeguard upper limit via dfs.namenode.replication.max-streams-hard-limit.
- Using more nodes with smaller disks. Total cluster capacity remaining the same, a cluster with more nodes and smaller disks will recover faster.
- Avoiding predominantly small blocks.
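Both estimates are easy to evaluate for a specific cluster. The following is a minimal Python sketch of the two formulas derived in this article; it is an illustration rather than part of the original text. The function and parameter names are hypothetical, and the default values (100 MB/s per disk, 3 second heartbeat, 2 streams) are the ones assumed in the article.

```python
def theoretical_min_recovery_seconds(
    c_tb: float, n_nodes: int, disk_mb_per_s: float = 100.0
) -> float:
    """Disk-bound lower bound on recovery time.

    Derived from (2 * p * c * 1,000,000) / (disk_bw * n * p); the number
    of disks per node, p, cancels out. With disk_bw = 100 MB/s this is
    the article's 20,000 * c / n.
    """
    return 2 * c_tb * 1_000_000 / (disk_mb_per_s * n_nodes)


def practical_min_recovery_seconds(
    p_disks: int,
    c_tb: float,
    block_mb: float,
    n_nodes: int,
    heartbeat_s: float = 3.0,
    max_streams: int = 2,
) -> float:
    """Scheduler-bound lower bound: (p * c * h * 1,000,000) / (b * n * m)."""
    # Blocks held by the lost node: node usage in MB / mean block size.
    blocks_lost = p_disks * c_tb * 1_000_000 / block_mb
    # Blocks the whole cluster can re-replicate per second under the throttle.
    blocks_per_second = n_nodes * max_streams / heartbeat_s
    return blocks_lost / blocks_per_second
```

For the article's example (100 nodes, 8 disks per node, 4TB used per disk, 128MB mean block size) these functions reproduce the 800 second and roughly 3,750 second figures. Passing `max_streams=4` halves the practical bound, which illustrates why raising dfs.namenode.replication.max-streams shortens recovery.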

ArpitAgarwal · ‎04-05-2017

I'd also post this question on the Ambari track to check why Ambari didn't detect the DataNodes going down. Also, from your logs it is hard to say why the DataNode went down. I again recommend increasing the DataNode heap allocation via Ambari. Also check that your nodes are provisioned with a sufficient amount of RAM.

spanchan · ‎06-20-2018

This worked.

Last Visited	‎11-03-2023 01:06 PM

Member Since	‎07-30-2019 10:45 AM
Posts	111
Kudos received	185

Cloudera Community

Re: What is active and passive NameNode in Hadoop?

Re: NameNode heapsize is bigger then it should be.

Re: Delete old BP-* DataNode directories by hand?

Re: NameNode edit logs - purging/Best practises

Re: Hadoop 3.0 in a Virtual Box for beginners

Re: Figuring out the active name node of a remote ...

Re: two name nodes are stand by after configuring ...

Re: Scaling the HDFS NameNode (part 4) - Avoiding ...

Re: Unable to Start DataNode in kerberos cluster

Re: 2 versions of datanodes currently live

Re: Does snapshot occupy space in HDFS.

Compiling Apache Tez with Apache Hadoop 2.8.0 or l...

HDFS Recovery Time from Single DataNode Failure

Re: Datanode Failures: DataXceiver error processin...

Re: Data Node (DN) Heap Usage Alerts in Ambari