
Background

The customer attached larger disks to expand DataNode storage on a host. The same procedure also applies when a disk physically fails and has to be replaced.

Step by Step

  • Decommission the HDFS DataNode component on the host
  • Decommissioning can take hours, depending on the amount of existing data
  • Wait until the DataNode is fully decommissioned
  • Put the host into maintenance mode
  • Stop all components on the host
  • Edit /etc/fstab to mount the new disks on the existing mount points. If possible, reference disks by UUID rather than by device name; UUIDs are much more stable, especially in cloud environments.
  • Manually create the YARN log and local folders under the mount points. Because the host is not reprovisioned, YARN will not create these directories from your configuration but will try to reuse them.
for disk in /hadoop/disk-sd{d..j}/Hadoop
do
  # -p also creates the intermediate yarn/ directory if it does not yet exist
  mkdir -p ${disk}/yarn/log && chown yarn:hadoop ${disk}/yarn/log
  mkdir -p ${disk}/yarn/local && chown yarn:hadoop ${disk}/yarn/local
done
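As an illustration of the UUID-based /etc/fstab entry mentioned above (the device name, UUID value, mount point, and filesystem type here are hypothetical placeholders, not taken from the cluster in this article):

```
# find the UUID of a newly attached disk
blkid /dev/sdd1

# example /etc/fstab line mounting it by UUID on an existing mount point
UUID=3e6be9de-8139-11d1-9106-a43f08d823a6  /hadoop/disk-sdd  ext4  defaults,noatime  0 0
```

Mounting by UUID means the entry keeps working even if the kernel assigns the disk a different device name after a reboot or a cloud re-attach.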

  • After changing the disk mount configuration, start all components on the host
  • Recommission the DataNode
  • Turn off maintenance mode
  • Check the HDFS blocks:
$ hdfs fsck / | egrep -v '^\.+$' | grep -v eplica

FSCK started by hdfs (auth:KERBEROS_SSL) from /192.168.141.39 for path / at Tue Nov 29 10:42:34 UTC 2016
Status: HEALTHY
 Total size:    769817156313 B (Total open files size: 75189484 B)
 Total dirs:    4934
 Total files:   23693
 Total symlinks:        0 (Files currently being written: 30)
 Total blocks (validated):      27536 (avg. block size 27956753 B) (Total open file blocks (not validated): 24)
 Corrupt blocks:        0
 Number of data-nodes:  7
 Number of racks:       1
FSCK ended at Tue Nov 29 10:42:34 UTC 2016 in 433 milliseconds

The filesystem under path '/' is HEALTHY
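For a scripted health check, the corrupt-block count can be pulled out of saved fsck output. A small sketch, with a two-line excerpt of the sample report above hard-coded for illustration (in practice you would capture `hdfs fsck /` output into a variable or file):

```shell
# excerpt of fsck output (normally: fsck_out=$(hdfs fsck /))
fsck_out='Corrupt blocks:  0
Number of data-nodes:  7'

# split each line on ":" plus whitespace and print the corrupt-block count
corrupt=$(printf '%s\n' "$fsck_out" | awk -F': *' '/Corrupt blocks/ {print $2}')
echo "corrupt blocks: $corrupt"
```

A non-zero count after recommissioning would mean some blocks were lost during the disk swap and need attention before proceeding to the next host.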

Comments
Contributor

Can you please explain why we need to stop all services on the host when only the DataNode and NodeManager are affected? Why do we also need to stop the RegionServer and other services?

Contributor

After stopping the DataNode, the HBase RegionServer still holds open file descriptors on the mounts, so it also needs to be stopped before you can unmount the volumes.

Version history
Revision #: 2 of 2
Last update: 08-17-2019 07:37 AM
Updated by: Contributor wbu