Created 11-10-2017 10:54 AM
When we start the DataNode on one of the worker machines, we get:
ERROR datanode.DataNode (DataNode.java:secureMain(2691)) - Exception in secureMain org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 4, volumes configured: 5, volumes failed: 1, volume failures tolerated: 0
and this:
WARN checker.StorageLocationChecker (StorageLocationChecker.java:check(208)) - Exception checking StorageLocation [DISK]file:/grid/sdc/hadoop/hdfs/data/ org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /xxxx/sdc/hadoop/hdfs/data
What are the steps needed to fix it?
Created 11-10-2017 12:31 PM
WARN checker.StorageLocationChecker (StorageLocationChecker.java:check(208)) - Exception checking StorageLocation [DISK]file:/grid/sdc/hadoop/hdfs/data/ org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /xxxx/sdc/hadoop/hdfs/data
The above error can occur when the hard disk/filesystem has gone bad and the filesystem is mounted read-only. Check the disk for hardware errors and try remounting the volume read-write.
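For example, assuming the affected volume is mounted at /grid/sdc (taken from the log above; adjust to your mount point), something like the following can confirm the state of the disk:

# Check whether the filesystem has fallen back to read-only ("ro" in the mount options)
grep /grid/sdc /proc/mounts
# Look for I/O errors from the disk in the kernel log
dmesg | grep -i sdc
# Try remounting read-write; if the disk is genuinely failing this may not stick
mount -o remount,rw /grid/sdc
# Verify the DataNode directory is writable again as the hdfs user
sudo -u hdfs touch /grid/sdc/hadoop/hdfs/data/write_test && sudo -u hdfs rm /grid/sdc/hadoop/hdfs/data/write_test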
Also, check the "dfs.datanode.failed.volumes.tolerated" property in "/etc/hadoop/conf/hdfs-site.xml"; it sets how many failed volumes the DataNode will tolerate.
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>1</value>
</property>
Created 11-10-2017 12:50 PM
Hi Aditya, on each worker machine we have 5 volumes, and we do not want to stay with 4 volumes on the problematic workers, so regarding option 2, we do not want to remove the volume. Second, what does setting dfs.datanode.failed.volumes.tolerated to 1 actually mean? Will the problem be fixed after an HDFS restart?
Created 11-10-2017 12:56 PM
If you set dfs.datanode.failed.volumes.tolerated to 'x', the DataNode will tolerate up to 'x' failed volumes and still start. So an HDFS restart should fix it.
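For reference, a manual DataNode restart (assuming a standard Hadoop 2.x layout with $HADOOP_HOME set; on an Ambari-managed cluster you would restart the DataNode from Ambari instead) looks roughly like:

# Restart the DataNode so it re-checks its volumes against the new tolerance
su - hdfs
$HADOOP_HOME/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop datanode
$HADOOP_HOME/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode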
Created 11-10-2017 01:14 PM
Another question: if I set this value to 1, does it mean that HDFS will start up even though the volume is bad, or just that the volume is not in use?
Created 11-10-2017 01:22 PM
Yes, it will start up in spite of the bad volume. If you don't want this to happen, you will have to replace your failed volume with a new volume (i.e., unmount the old one and mount the new one).
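As a rough sketch, assuming the failed disk is /dev/sdc1 mounted at /grid/sdc (both hypothetical, based on the log above) and the usual hdfs:hadoop ownership, replacing the volume would go something like:

# Stop the DataNode before touching its storage directories
su - hdfs -c '$HADOOP_HOME/sbin/hadoop-daemon.sh stop datanode'
# Unmount the bad disk
umount /grid/sdc
# ... physically replace the disk, then create a fresh filesystem on it ...
mkfs -t ext4 /dev/sdc1
mount /dev/sdc1 /grid/sdc
# Recreate the DataNode directory with the right ownership
mkdir -p /grid/sdc/hadoop/hdfs/data
chown -R hdfs:hadoop /grid/sdc/hadoop/hdfs/data
# Start the DataNode again; HDFS will re-replicate any blocks lost with the old disk
su - hdfs -c '$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode'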