How to Alert for HDFS Disk Failures

Expert Contributor

When a single drive fails on a worker node in HDFS, can this adversely affect the performance of jobs running on this node? Or does the DataNode process quickly mark the drive and its HDFS blocks as "unusable"?

If this could cause a performance impact, how can our customers monitor for these drive failures in order to take corrective action? Ambari Alerts?

1 ACCEPTED SOLUTION

Expert Contributor

@Wes Floyd

Since there are multiple questions here, I am going to answer each one individually.

> When a single drive fails on a worker node in HDFS, can this adversely affect performance of jobs running on this node?

The answer is: it depends. If the node is running a job that is accessing blocks on the failed volume, then yes. It is also possible that the job would be treated as failed if dfs.datanode.failed.volumes.tolerated is not greater than 0. If it is not greater than zero (the default is 0), HDFS treats the loss of a volume as catastrophic and marks the DataNode as failed. If it is set to a value greater than zero, the node keeps working until more volumes than that are lost.
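For reference, that tolerance is set in hdfs-site.xml on the DataNodes; a minimal sketch (the value of 1 below is only an illustration, choose it based on how many disks each node carries):

<!-- hdfs-site.xml on the DataNode: number of volumes allowed to fail before
     the DataNode shuts itself down. The default of 0 means any single disk
     failure takes the whole DataNode offline. The value 1 is only an example. -->
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>1</value>
</property>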

> If this could cause a performance impact, how can our customers monitor for these drive failures in order to take corrective action?

This is a hard question to answer without further details. I am tempted to say that the performance benefit you would get from monitoring and relying on a human being to take corrective action is doubtful: YARN / MR, or whatever execution engine you are using, is probably going to be much more efficient at re-scheduling your jobs.

> Or does the DataNode process quickly mark the drive and its HDFS blocks as "unusable"?

The DataNode does mark the volume as failed, and the NameNode learns that all the blocks on that failed volume are no longer available on that DataNode. This happens via "block reports". Once the NameNode learns that a DataNode has lost a replica of a block, it initiates re-replication to bring the block back to its target replication factor. Because the NameNode knows about the lost blocks, further jobs that need access to those blocks will most probably not be scheduled on that node, although this again depends on the scheduler and its policies.
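To make the monitoring side concrete, here is a rough sketch of how you could watch for these events cluster-wide through the NameNode's JMX servlet. It assumes Hadoop 2.7 or later (where the FSNamesystemState bean reports VolumeFailuresTotal), the default NameNode HTTP port 50070 (9870 on Hadoop 3.x), and a hypothetical hostname; bean and metric names can differ between versions, so verify them against your own /jmx output.

# Rough sketch: poll the NameNode JMX servlet for cluster-wide volume failures
# and under-replicated blocks, so an alert can be raised before capacity or
# replication becomes a problem. Hostname and port are assumptions.
import json
from urllib.request import urlopen

NAMENODE = "http://namenode.example.com:50070"  # hypothetical host, default HTTP port

def fetch_bean(query):
    # The JMX servlet returns {"beans": [ {...} ]} for a matching ObjectName.
    with urlopen("%s/jmx?qry=%s" % (NAMENODE, query)) as resp:
        beans = json.load(resp)["beans"]
    return beans[0] if beans else {}

state = fetch_bean("Hadoop:service=NameNode,name=FSNamesystemState")
fsns = fetch_bean("Hadoop:service=NameNode,name=FSNamesystem")

failed_volumes = state.get("VolumeFailuresTotal", 0)
under_replicated = fsns.get("UnderReplicatedBlocks", 0)

if failed_volumes > 0:
    print("ALERT: %d failed volume(s) reported across the cluster" % failed_volumes)
if under_replicated > 0:
    print("WARN: %d under-replicated blocks (re-replication may be in progress)" % under_replicated)

If you already run Ambari, its built-in HDFS alerts cover similar ground, so a script like this is mainly useful for feeding an external monitoring system.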


3 REPLIES

Master Mentor

@Wes Floyd

Infrastructure monitoring tools play a key role here.

Regarding performance, I just found this useful blog post:

Modern servers have a lot of disks. What's the impact of losing a single disk when you have 12 x 3TB drives in each node?

A: When a single drive fails and Hadoop is configured in its default state, the ENTIRE NODE gets taken offline. Back when servers typically had 6 x 1.5TB drives in them, losing a single disk would cause the loss of 0.02% of total storage in a typical 10PB, three-replica setup. With today's hardware, typically 12 x 3TB drives per node, losing a single disk results in the loss of five times as much data.

Super Collaborator

Due to the wide variety of drive configurations used in DataNodes, the disk failure tolerance is configurable. The dfs.datanode.failed.volumes.tolerated property in hdfs-site.xml lets you specify how many volumes can be lost before the DataNode is marked offline. Once the node is offline, the NameNode uses the information in block reports to create new replicas for any blocks left under-replicated by the failure.

You can tune the HDFS DataNode storage alert's warning threshold to reflect the minimum amount of free storage you want to keep available; this gives you a warning before things go critical. Individual disk monitoring is usually handled by an enterprise-level monitoring tool such as OpenView.
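If you want to feed per-node disk health into such a tool, one low-effort option is the DataNode's own JMX servlet. A rough sketch, assuming the Hadoop 2.x default DataNode HTTP port 50075 (9864 on Hadoop 3.x); the dataset bean name varies by version, which is why the script matches on a prefix instead of an exact name, and the hostname is hypothetical.

# Rough sketch of a Nagios-style check: ask one DataNode's JMX servlet how many
# of its volumes have failed, and exit non-zero so an external monitoring tool
# can raise an alert. Hostname/port are assumptions; adjust for your cluster.
import json
import sys
from urllib.request import urlopen

datanode = sys.argv[1] if len(sys.argv) > 1 else "datanode1.example.com:50075"

with urlopen("http://%s/jmx" % datanode) as resp:
    beans = json.load(resp)["beans"]

# The dataset bean is named FSDatasetState (sometimes with a storage-id suffix),
# so match on the prefix rather than an exact ObjectName.
for bean in beans:
    if not bean.get("name", "").startswith("Hadoop:service=DataNode,name=FSDatasetState"):
        continue
    failed = bean.get("NumFailedVolumes", 0)
    if failed > 0:
        print("CRITICAL: %s reports %d failed volume(s): %s"
              % (datanode, failed, bean.get("FailedStorageLocations", "unknown")))
        sys.exit(2)  # Nagios convention: 2 = critical

print("OK: no failed volumes reported by %s" % datanode)
sys.exit(0)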
