What's the best way to reboot a CDH 5.3.1 node? Will decommissioning and recommissioning the node help? Or is it sufficient to just stop all services on the node, reboot it, and start the services again?
The nodes I need to reboot are HDFS and YARN nodes, and some are HDFS NameNodes. We are running in HA mode, so is it safe to assume that the order in which the active and standby NameNodes are restarted doesn't matter?
FWIW, we are doing this because there is a kernel bug in Ubuntu 12 that causes HDFS errors on our YARN nodes:
Feb 11 13:17:11 abacus105 kernel: [8638490.380039] EXT4-fs warning (device sdd5): ext4_da_update_reserve_space:362: ino 6553807, allocated 1 with only 0 reserved metadata blocks (releasing 4 blocks with reserved 38 data blocks)
These errors occur on all YARN cache filesystems on affected hosts, but not on HDFS data filesystems. They cause a kernel thread to dump, but don't seem to be causing problems with data integrity. Kernels at version 3.13 do not seem to be affected.
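In case it helps anyone check which filesystems on a host are affected, here is a rough Python sketch that counts these warnings per device. The regular expression is modeled only on the single log line quoted above, so treat it as an assumption that other kernels format the message the same way. The names EXT4_WARNING and affected_devices are made up for this example.

```python
import re

# Pattern based on the one warning line quoted in this post (an assumption:
# other kernel versions may word or number the message differently).
EXT4_WARNING = re.compile(
    r"EXT4-fs warning \(device (?P<device>[^)]+)\): "
    r"ext4_da_update_reserve_space:\d+: ino (?P<inode>\d+)"
)


def affected_devices(lines):
    """Return a dict mapping device name to the number of matching warnings."""
    counts = {}
    for line in lines:
        match = EXT4_WARNING.search(line)
        if match:
            device = match.group("device")
            counts[device] = counts.get(device, 0) + 1
    return counts
```

To use it, pass any iterable of log lines, for example an open kernel log file (on our Ubuntu hosts that would be something like /var/log/kern.log, but the path is an assumption for your setup). Comparing the reported devices against the mount points of the YARN cache and HDFS data directories should show whether the pattern described above holds on a given host.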