Cloudera Data Analytics (CDA) Articles


Summary

Worker node data disk failures are to be expected whilst managing your CDP cluster. This blog takes you through the steps to gracefully replace failed worker node disks with the least disruption to your CDP cluster.

Investigation

Cloudera Manager Notification

One easy way to identify a disk failure within your cluster is through the Cloudera Manager UI. You will see the following type of error:

MichaelBush_0-1686391049095.png

Cloudera Manager will also track multiple disk failures:

MichaelBush_1-1686391049096.png

HDFS NameNode - DataNode Volume Failures

The failed disks within your cluster can also be observed from within the HDFS NameNode UI:

MichaelBush_2-1686391049074.png

This view is also useful for quickly identifying exactly which storage locations have failed.

Confirming from the Command Line

Taking the last example from HDFS NameNode - DataNode Volume Failures, we can see that /data/20 and /data/6 are both failed directories.

Listing those directories from the command line on the worker node also confirms the disk issue, while a healthy directory such as /data/1 lists normally:

[root@<WorkerNode> ~]# ls -larth /data/20
ls: cannot access /data/20: Input/output error
[root@<WorkerNode> ~]# ls -larth /data/6
ls: cannot access /data/6: Input/output error
[root@<WorkerNode> ~]# ls -larth /data/1
total 0
drwxr-xr-x. 26 root   root   237 Sep 30 02:54 ..
drwxr-xr-x.  3 root   root    20 Oct  1 06:45 kudu
drwxr-xr-x.  3 root   root    16 Oct  1 06:46 dfs
drwxr-xr-x.  3 root   root    16 Oct  1 06:47 yarn
drwxr-xr-x.  3 root   root    29 Oct  1 06:48 impala
drwxr-xr-x.  2 impala impala   6 Oct  1 06:48 cores
drwxr-xr-x.  7 root   root    68 Oct  1 06:48 .
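With 24 data directories per node, checking each one by hand gets tedious. The following is a minimal sketch of the same check in a loop. The function name check_data_dirs is hypothetical, and it assumes that a failed disk shows up as an ls error, as in the session above.

```shell
# Report whether each directory passed as an argument can be listed.
# A failing `ls` (for example "Input/output error") marks it as FAILED.
check_data_dirs() {
  local dir
  for dir in "$@"; do
    if ls -a "$dir" >/dev/null 2>&1; then
      echo "OK     $dir"
    else
      echo "FAILED $dir"
    fi
  done
}

# Example in bash, assuming the /data/1 to /data/24 layout used in this article:
#   check_data_dirs /data/{1..24}
```

A directory that is simply missing is also reported as FAILED, so treat the output as a pointer to investigate rather than proof of a hardware fault.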

Resolution

Replace a disk on a Worker Node

Any single worker node host runs a number of roles. This is an example of a worker node that is showing a failed disk:

MichaelBush_3-1686391049104.png

Decommission the Worker Node

As there are multiple roles running on a worker node, it’s best to use the decommissioning process to gracefully remove the worker node from running services. This can be found by navigating to the host within Cloudera Manager and using “Actions > Begin Maintenance”.

MichaelBush_4-1686391049086.png

It will then take you to the following page:

MichaelBush_5-1686391049112.png

Click “Begin Maintenance” and wait for the process to complete.

Expect this process to take hours on a busy cluster. The time it takes to complete depends on:

  • The number of regions that the HBase RegionServer is hosting
  • The number of blocks that the HDFS DataNode is hosting
  • The number of tablets that the Kudu TabletServer is hosting
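Cloudera Manager reports the overall progress, but you may also want to watch the HDFS part from the command line. The following is a rough sketch that filters the output of hdfs dfsadmin -report for DataNodes that are not in the "Normal" state. The function name decommission_status is hypothetical, and it assumes the report contains "Hostname:" and "Decommission Status :" lines, which may differ between versions. It says nothing about HBase or Kudu progress.

```shell
# Print every DataNode whose decommission status is not "Normal".
# Reads `hdfs dfsadmin -report` text on stdin (line format is an assumption).
decommission_status() {
  awk -F': ' '
    /^Hostname:/           { host = $2 }
    /^Decommission Status/ { if ($2 != "Normal") print host ": " $2 }
  '
}

# Example, run as a user with HDFS admin rights:
#   hdfs dfsadmin -report | decommission_status
```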

Replace and Configure the disks

Once the worker node is fully decommissioned, the disks are ready to be physically replaced and configured within your datacenter by your infrastructure team.

Every organization is going to have its own internal processes to configure the newly replaced disks. Let’s go through an example of how this work can be verified, for reference.

List the attached block devices

[root@<WorkerNode> ~]# lsblk
NAME             MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                8:0    0  3.7T  0 disk /data/1
sdb                8:16   0  3.7T  0 disk /data/2
sdc                8:32   0  3.7T  0 disk /data/3
sdd                8:48   0  3.7T  0 disk /data/4
sde                8:64   0  3.7T  0 disk /data/5
sdf                8:80   0  3.7T  0 disk /data/6
sdg                8:96   0  3.7T  0 disk /data/7
sdh                8:112  0  3.7T  0 disk /data/8
sdi                8:128  0  3.7T  0 disk /data/9
sdj                8:144  0  3.7T  0 disk /data/10
sdk                8:160  0  3.7T  0 disk /data/11
sdl                8:176  0  3.7T  0 disk /data/12
sdm                8:192  0  3.7T  0 disk /data/13
sdn                8:208  0  3.7T  0 disk /data/14
sdo                8:224  0  3.7T  0 disk /data/15
sdp                8:240  0  3.7T  0 disk /data/16
sdq               65:0    0  3.7T  0 disk /data/17
sdr               65:16   0  3.7T  0 disk /data/18
sds               65:32   0  3.7T  0 disk /data/19
sdt               65:48   0  3.7T  0 disk /data/20
sdu               65:64   0  3.7T  0 disk /data/21
sdv               65:80   0  3.7T  0 disk /data/22
sdw               65:96   0  3.7T  0 disk /data/23
sdx               65:112  0  3.7T  0 disk /data/24
sdy               65:128  0  1.8T  0 disk
├─sdy1            65:129  0    1G  0 part /boot
├─sdy2            65:130  0   20G  0 part [SWAP]
└─sdy3            65:131  0  1.7T  0 part
  ├─vg01-root    253:0    0  500G  0 lvm  /
  ├─vg01-kuduwal 253:1    0  100G  0 lvm  /kuduwal
  ├─vg01-home    253:2    0   50G  0 lvm  /home
  └─vg01-var     253:3    0  100G  0 lvm  /var
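Scanning 24 rows by eye for a missing mount point is error-prone. The following is a small sketch that reads lsblk output and prints any expected /data/N mount point that does not appear. The function name missing_data_mounts is hypothetical, and it assumes the /data/1 to /data/N naming used in this example and that the mount point is the last column.

```shell
# Print every /data/1 .. /data/<count> that is absent from lsblk output.
# $1: number of data disks expected; lsblk output is read from stdin.
missing_data_mounts() {
  local count="$1"
  awk -v count="$count" '
    $NF ~ /^\/data\/[0-9]+$/ { seen[$NF] = 1 }
    END {
      for (i = 1; i <= count; i++) {
        path = "/data/" i
        if (!(path in seen)) print path
      }
    }
  '
}

# Example for the 24-disk worker node shown above:
#   lsblk | missing_data_mounts 24
```

No output means every expected mount point is present in the listing.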

List the IDs of the block devices

[root@<WorkerNode> ~]# blkid
/dev/sdy1: UUID="4b2f1296-460c-4cbc-8aca-923c9309d4fe" TYPE="xfs"
/dev/sdy2: UUID="af9c4c79-21b9-4d02-9453-ede88b920c1f" TYPE="swap"
/dev/sdy3: UUID="j9n4QD-60xB-rqpQ-Ck3y-s2m0-FdSo-IGWrN9" TYPE="LVM2_member"
/dev/sdb: UUID="4865e719-e77c-4d1e-b1e0-80ae1d0d6e82" TYPE="xfs"
/dev/sdc: UUID="59ae0b91-3cfc-4c53-a02f-e20bdf0ac209" TYPE="xfs"
/dev/sdd: UUID="b80473e0-bce8-413c-9740-934e8ed7006e" TYPE="xfs"
/dev/sda: UUID="684e32c8-eeb2-4215-b861-880543b1f96b" TYPE="xfs"
/dev/sdg: UUID="0f0d12ac-7d93-4c76-9f5c-ac6b43f2eaff" TYPE="xfs"
/dev/sde: UUID="06c0e908-dd67-4a42-8615-7b7335a7e0f6" TYPE="xfs"
/dev/sdf: UUID="9346fa04-dc1a-4dcc-8233-a5cb65495998" TYPE="xfs"
/dev/sdn: UUID="8f05d1dd-94d1-4376-9409-d5683ad4c225" TYPE="xfs"
/dev/sdo: UUID="5e0413d1-0b82-4ec1-b3f9-bb072db39071" TYPE="xfs"
/dev/sdh: UUID="08063201-f252-49dd-8402-042afbea78a2" TYPE="xfs"
/dev/sdl: UUID="1e5ace85-f93c-46f7-bf65-353f774cfeaa" TYPE="xfs"
/dev/sdk: UUID="195967b5-a1a0-43bb-9a33-9cf7a36fdcb6" TYPE="xfs"
/dev/sdq: UUID="db81b056-587e-47a6-844e-2d952278324b" TYPE="xfs"
/dev/sdr: UUID="45b4cf68-6f10-4dc7-8128-c2006e7aba5d" TYPE="xfs"
/dev/sds: UUID="a8e591e9-33c8-478a-b580-aeac9ad4cf44" TYPE="xfs"
/dev/sdi: UUID="a0187ae0-7598-44c4-805c-ef253dea6e7a" TYPE="xfs"
/dev/sdm: UUID="720836d8-ddd6-406d-a33f-f1b92f9b40d5" TYPE="xfs"
/dev/sdv: UUID="df4bdd58-e8d2-4bdb-8255-b9c7fcfe8999" TYPE="xfs"
/dev/sdw: UUID="701f3516-03bc-461b-930c-ab34d0b417d7" TYPE="xfs"
/dev/sdu: UUID="5e1bd2f3-8ccc-4ba1-a0f7-bb55c8246d72" TYPE="xfs"
/dev/sdj: UUID="264b85f8-9740-418b-a811-20666a305caa" TYPE="xfs"
/dev/sdt: UUID="53f2f06e-71e9-4796-86a3-2212c0f652ea" TYPE="xfs"
/dev/sdp: UUID="e6b984c0-6d85-4df2-9a7d-cc1c87238c49" TYPE="xfs"
/dev/mapper/vg01-root: UUID="18bc42fe-dbfd-4005-8e13-6f5d2272d9a7" TYPE="xfs"
/dev/sdx: UUID="53e4023f-583a-4219-bfd2-1a94e15f34ef" TYPE="xfs"
/dev/mapper/vg01-kuduwal: UUID="a1441e2f-718b-42eb-b398-28ce20ee50ad" TYPE="xfs"
/dev/mapper/vg01-home: UUID="fbc8e522-64da-4cc3-87b6-89ea83fb0aa0" TYPE="xfs"
/dev/mapper/vg01-var: UUID="93b1537f-a1a9-4616-b79a-cab9a1e39bf1" TYPE="xfs"

View the /etc/fstab

[root@<WorkerNode> ~]# cat /etc/fstab
/dev/mapper/vg01-root   /                       xfs     defaults        0 0
UUID=4b2f1296-460c-4cbc-8aca-923c9309d4fe /boot                   xfs     defaults        0 0
/dev/mapper/vg01-home   /home                   xfs     defaults        0 0
/dev/mapper/vg01-kuduwal /kuduwal                xfs     defaults        0 0
/dev/mapper/vg01-var    /var                    xfs     defaults        0 0
UUID=af9c4c79-21b9-4d02-9453-ede88b920c1f swap                    swap    defaults        0 0
UUID=684e32c8-eeb2-4215-b861-880543b1f96b /data/1 xfs noatime,nodiratime 0 0
UUID=4865e719-e77c-4d1e-b1e0-80ae1d0d6e82 /data/2 xfs noatime,nodiratime 0 0
UUID=59ae0b91-3cfc-4c53-a02f-e20bdf0ac209 /data/3 xfs noatime,nodiratime 0 0
UUID=b80473e0-bce8-413c-9740-934e8ed7006e /data/4 xfs noatime,nodiratime 0 0
UUID=06c0e908-dd67-4a42-8615-7b7335a7e0f6 /data/5 xfs noatime,nodiratime 0 0
UUID=9346fa04-dc1a-4dcc-8233-a5cb65495998 /data/6 xfs noatime,nodiratime 0 0
UUID=0f0d12ac-7d93-4c76-9f5c-ac6b43f2eaff /data/7 xfs noatime,nodiratime 0 0
UUID=08063201-f252-49dd-8402-042afbea78a2 /data/8 xfs noatime,nodiratime 0 0
UUID=a0187ae0-7598-44c4-805c-ef253dea6e7a /data/9 xfs noatime,nodiratime 0 0
UUID=264b85f8-9740-418b-a811-20666a305caa /data/10 xfs noatime,nodiratime 0 0
UUID=195967b5-a1a0-43bb-9a33-9cf7a36fdcb6 /data/11 xfs noatime,nodiratime 0 0
UUID=1e5ace85-f93c-46f7-bf65-353f774cfeaa /data/12 xfs noatime,nodiratime 0 0
UUID=720836d8-ddd6-406d-a33f-f1b92f9b40d5 /data/13 xfs noatime,nodiratime 0 0
UUID=8f05d1dd-94d1-4376-9409-d5683ad4c225 /data/14 xfs noatime,nodiratime 0 0
UUID=5e0413d1-0b82-4ec1-b3f9-bb072db39071 /data/15 xfs noatime,nodiratime 0 0
UUID=e6b984c0-6d85-4df2-9a7d-cc1c87238c49 /data/16 xfs noatime,nodiratime 0 0
UUID=db81b056-587e-47a6-844e-2d952278324b /data/17 xfs noatime,nodiratime 0 0
UUID=45b4cf68-6f10-4dc7-8128-c2006e7aba5d /data/18 xfs noatime,nodiratime 0 0
UUID=a8e591e9-33c8-478a-b580-aeac9ad4cf44 /data/19 xfs noatime,nodiratime 0 0
UUID=53f2f06e-71e9-4796-86a3-2212c0f652ea /data/20 xfs noatime,nodiratime 0 0
UUID=5e1bd2f3-8ccc-4ba1-a0f7-bb55c8246d72 /data/21 xfs noatime,nodiratime 0 0
UUID=df4bdd58-e8d2-4bdb-8255-b9c7fcfe8999 /data/22 xfs noatime,nodiratime 0 0
UUID=701f3516-03bc-461b-930c-ab34d0b417d7 /data/23 xfs noatime,nodiratime 0 0
UUID=53e4023f-583a-4219-bfd2-1a94e15f34ef /data/24 xfs noatime,nodiratime 0 0
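A newly formatted replacement disk normally gets a new filesystem UUID, so an /etc/fstab entry that still carries the old UUID will not mount. The following is a minimal sketch that cross-checks the two listings above. The function name stale_fstab_uuids is hypothetical, and it assumes the data disks are mounted by UUID under /data as in this example.

```shell
# Print each /data mount in an fstab file whose UUID is absent from blkid output.
# $1: path to the fstab file; $2: path to a file holding saved `blkid` output.
stale_fstab_uuids() {
  local fstab="$1" blkid_out="$2" uuid mountpoint rest
  grep '^UUID=' "$fstab" | while read -r uuid mountpoint rest; do
    case "$mountpoint" in
      /data/*)
        # Strip the "UUID=" prefix and look for the quoted value in blkid output.
        if ! grep -q "UUID=\"${uuid#UUID=}\"" "$blkid_out"; then
          echo "$mountpoint ${uuid#UUID=}"
        fi
        ;;
    esac
  done
}

# Example:
#   blkid > /tmp/blkid.out
#   stale_fstab_uuids /etc/fstab /tmp/blkid.out
```

No output means every /data entry in the fstab file matches a device that blkid can see.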

Recommission the Worker Node

Once the disks have been suitably replaced, it’s time to use the recommissioning process to gracefully reintroduce the worker node back into the cluster. This can be found by navigating to the host within Cloudera Manager and using “Actions > End Maintenance”.

MichaelBush_6-1686391049101.png

After the node has completed its recommission cycle, follow the guidance in the next sections to perform local disk rebalancing where appropriate.

Address local disk HDFS Balancing

Most clusters use HDFS, which has a local disk balancer that you can make use of. For guidance, see Rebalance your HDFS Disks (single node).
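As a quick orientation, the HDFS disk balancer is usually driven in three steps: plan, execute and query. The following sketch only prints the commands for one DataNode so that you can review them before running anything. The function name diskbalancer_steps and the example hostname and plan path are hypothetical. It assumes that dfs.disk.balancer.enabled is set to true in your HDFS configuration and that you pass in the plan file location reported by the plan step.

```shell
# Print the disk balancer command sequence for a single DataNode (dry run only).
# $1: DataNode hostname; $2: plan file location reported by the -plan step.
diskbalancer_steps() {
  local host="$1" plan="$2"
  echo "hdfs diskbalancer -plan $host"
  echo "hdfs diskbalancer -execute $plan"
  echo "hdfs diskbalancer -query $host"
}

# Example with placeholder values:
#   diskbalancer_steps wn1.example.com /path/to/wn1.example.com.plan.json
```

Printing rather than executing keeps the sketch safe to try on any host. Run the commands one at a time and check the query output before moving on.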

Address local disk Kudu Balancing

If you are running Kudu within your cluster, you will need to rebalance the existing Kudu data on the local disks of the worker node. For guidance, see Rebalance your Kudu Disks (single node).
