Cloudera Employee

In large clusters with a few hundred nodes, the master services tend to be busy. One such master service is the NameNode (NN). Some of the critical activities the NN performs, to name a few, are:

1. Serving client requests, which includes permission verification and authentication checks for HDFS resources.

2. Continuously processing block reports from all the DataNodes.

3. Updating the service and audit logs.

When rogue applications try to access many HDFS resources at once, or a data ingestion job loads high data volumes, the NN becomes very busy. In clusters like these, the NN FSImage also tends to be gigabytes in size, so operations such as checkpointing consume considerable bandwidth between the two NameNodes. On top of that, a high volume of edit log syncs combined with service logging causes high disk utilization, which can lead to NameNode instability. It is therefore recommended to place service logs and edit logs on dedicated disks.
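As a rough way to check that recommendation on a NameNode host, the sketch below groups directories by the device ID that `os.stat` reports. The two paths are hypothetical placeholders, not values from this article; substitute your own edit log directory (for example the value of `dfs.namenode.edits.dir`) and your NameNode log directory. Note that the device ID identifies a filesystem, so two partitions on the same physical disk would still show up as different devices.

```python
import os


def device_of(path):
    """Return the device ID of the filesystem that holds `path`."""
    return os.stat(path).st_dev


def shared_devices(paths):
    """Group labelled paths by device ID; return groups with more than one label."""
    groups = {}
    for label, path in paths.items():
        groups.setdefault(device_of(path), []).append(label)
    return [labels for labels in groups.values() if len(labels) > 1]


if __name__ == "__main__":
    # Hypothetical locations, for illustration only.
    candidates = {
        "edit logs": "/hadoop/hdfs/namenode/edits",
        "service logs": "/var/log/hadoop/hdfs",
    }
    # Skip paths that do not exist on this host.
    existing = {k: v for k, v in candidates.items() if os.path.exists(v)}
    for labels in shared_devices(existing):
        print("Sharing one device:", ", ".join(labels))
```

If the script prints nothing, the directories it found sit on different filesystems; confirming that those filesystems map to different physical disks still needs a look at the host's disk layout.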

Disk I/O can be monitored with `iostat`.
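A minimal sketch of how that output could be scanned for busy disks is shown here. It assumes the extended device report (`iostat -dx`), which includes a `%util` column; the function name `busy_devices`, the 80% threshold, and the sample text are all made up for illustration, and the sample uses a shortened set of columns rather than real measurements. Columns are located by header name because the exact layout differs between sysstat versions.

```python
def busy_devices(iostat_text, threshold=80.0):
    """Return {device: %util} for devices whose %util is at or above threshold."""
    busy = {}
    columns = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            columns = None  # a blank line ends the current device table
            continue
        if fields[0].rstrip(":") == "Device":
            columns = fields  # header row; older versions print "Device:"
            continue
        if columns and "%util" in columns and len(fields) == len(columns):
            util = float(fields[columns.index("%util")])
            if util >= threshold:
                busy[fields[0]] = util
    return busy


# Made-up sample in the general shape of an extended device report.
SAMPLE = """\
Device  r/s  w/s    rkB/s  wkB/s    %util
sda     1.0  250.0  16.0   51200.0  93.5
sdb     0.5  12.0   8.0    640.0    7.2
"""

if __name__ == "__main__":
    print(busy_devices(SAMPLE))  # {'sda': 93.5}
```

On a live host the text could come from something like `iostat -dx 5 2`; the first report covers the time since boot, so the later report is usually the more useful one. When several reports are present, this sketch keeps the most recent value per device.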

Version history
Revision #:
1 of 1
Last update:
‎09-14-2017 09:28 PM