Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.

Unable to start datanode - Too many failed volumes.

New Member

Hi,

After reinstalling HDP 2.3, I am getting the following error when I try to restart the DataNode service.

org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 3, volumes configured: 9, volumes failed: 6, volume failures tolerated: 0
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:289)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1412)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1364)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:224)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:821)
    at java.lang.Thread.run(Thread.java:745)

When I dug into the data directories, I found that some of them contain directories from the prior installation. How do I fix this issue?

Thanks in advance.

Regards,

Subramanian S.

1 ACCEPTED SOLUTION


Subramanian Santhanam

Check your hdfs-site.xml for dfs.data.dir.

This is a comma-delimited list of directories. Remove what you do not need.

If the cluster is Ambari-managed, then change this from Ambari:

HDFS -> Config -> DataNode directories

Ensure that it is configured correctly.
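
For illustration only (the mount-point paths below are hypothetical), a cleaned-up DataNode directory setting in hdfs-site.xml could look like this; dfs.data.dir is the older name of the property, which newer Hadoop releases call dfs.datanode.data.dir:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/1/hadoop/hdfs/data,/data/2/hadoop/hdfs/data,/data/3/hadoop/hdfs/data</value>
</property>

List only directories that exist, are mounted, and are writable by the HDFS service user; every configured entry that cannot be accessed at startup counts as a failed volume.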


4 REPLIES


New Member

Hi,

The issue is fixed. When I ran the cleanup script, I removed the users, so the folder ownership was left orphaned.

I fixed the directories using chown, and now it is working fine. Thanks @Rahul Pathak and @Kuldeep Kulkarni.

Regards,

Subramanian S.

Master Guru
@Subramanian Santhanam

Please check why you have 6 disk failures. As a workaround, you can do what Rahul has suggested in his answer, or you can increase the value of the property below to allow the DataNode to tolerate X failed disks.

dfs.datanode.failed.volumes.tolerated - By default this is set to 0 in hdfs-site.xml
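
A minimal sketch of how this property could be set in hdfs-site.xml (the value 2 is only a placeholder; choose a number that matches how many bad disks you are willing to run with):

<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>2</value>
</property>

The DataNode has to be restarted for the change to take effect, and on an Ambari-managed cluster the property should be changed through Ambari so the edit is not overwritten.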

New Member


<configuration>

  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop-3.3.4/data/namenode</value>
  </property>

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hadoop-3.3.4/data/datanode</value>
  </property>

</configuration>

I have this configuration, but I am still having problems.