datanode + Directory is not writable
Created 04-23-2018 05:09 PM
We have an Ambari cluster, HDP version 2.6.0.1.
We have an issue on worker02. According to the log hadoop-hdfs-datanode-worker02.sys65.com.log:
2018-04-21 09:02:53,405 WARN checker.StorageLocationChecker (StorageLocationChecker.java:check(208)) - Exception checking StorageLocation [DISK]file:/grid/sdc/hadoop/hdfs/data/
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /grid/sdc/hadoop/hdfs/data
Note: from the Ambari GUI we can see that the DataNode on worker02 is down.
In the log we can see "Directory is not writable: /grid/sdc/hadoop/hdfs/data" and the following:
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   user = hdfs
STARTUP_MSG:   host = worker02.sys65.com/23.87.23.126
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.7.3.2.6.0.3-8
STARTUP_MSG:   build = git@github.com:hortonworks/hadoop.git -r c6befa0f1e911140cc815e0bab744a6517abddae; compiled by 'jenkins' on 2017-04-01T21:32Z
STARTUP_MSG:   java = 1.8.0_112
************************************************************/
2018-04-21 09:02:52,854 INFO datanode.DataNode (LogAdapter.java:info(47)) - registered UNIX signal handlers for [TERM, HUP, INT]
2018-04-21 09:02:53,321 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for [DISK]file:/grid/sdb/hadoop/hdfs/data/
2018-04-21 09:02:53,330 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for [DISK]file:/grid/sdc/hadoop/hdfs/data/
2018-04-21 09:02:53,330 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for [DISK]file:/grid/sdd/hadoop/hdfs/data/
2018-04-21 09:02:53,331 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for [DISK]file:/grid/sde/hadoop/hdfs/data/
2018-04-21 09:02:53,331 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(107)) - Scheduling a check for [DISK]file:/grid/sdf/hadoop/hdfs/data/
2018-04-21 09:02:53,405 WARN checker.StorageLocationChecker (StorageLocationChecker.java:check(208)) - Exception checking StorageLocation [DISK]file:/grid/sdc/hadoop/hdfs/data/
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /grid/sdc/hadoop/hdfs/data
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:124)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:99)
    at org.apache.hadoop.hdfs.server.datanode.StorageLocation.check(StorageLocation.java:128)
    at org.apache.hadoop.hdfs.server.datanode.StorageLocation.check(StorageLocation.java:44)
    at org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker$1.call(ThrottledAsyncChecker.java:127)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-04-21 09:02:53,410 ERROR datanode.DataNode (DataNode.java:secureMain(2691)) - Exception in secureMain
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 4, volumes configured: 5, volumes failed: 1, volume failures tolerated: 0
    at org.apache.hadoop.hdfs.server.datanode.checker.StorageLocationChecker.check(StorageLocationChecker.java:216)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2583)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2492)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2539)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2684)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2708)
2018-04-21 09:02:53,411 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2018-04-21 09:02:53,414 INFO datanode.DataNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at worker02.sys65.com/23.87.23.126
************************************************************/
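(As a side note, the ERROR line above shows "volume failures tolerated: 0", which is why a single bad volume prevents the whole DataNode from starting. The property that controls this is dfs.datanode.failed.volumes.tolerated; a quick way to check its current value, assuming the standard HDP config path, is:
# grep -B1 -A2 'dfs.datanode.failed.volumes.tolerated' /etc/hadoop/conf/hdfs-site.xml
Setting it to 1 would let the DataNode start with one failed volume while the disk is repaired, at the cost of the lost capacity.)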
We checked that:
1. All files and folders under /grid/sdc/hadoop/hdfs/ are owned by hdfs:hadoop, and that is OK.
2. Disk sdc is mounted read-write (rw,noatime,data=ordered), and that is OK.
(A quick way to reproduce the DataNode's writability check by hand is sketched below.)
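To approximate what DiskChecker does (a rough sketch: create and delete a file in the data directory as the hdfs user):
# sudo -u hdfs touch /grid/sdc/hadoop/hdfs/data/.write_test && sudo -u hdfs rm /grid/sdc/hadoop/hdfs/data/.write_test && echo writable || echo NOT writable
It is also worth confirming the kernel has not silently remounted the filesystem read-only after an I/O error; /proc/mounts shows the effective flags, which can differ from fstab:
# grep sdc /proc/mounts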
We suspect that the hard disk has gone bad. In that case, how do we check that?
Please advise what other options there are to resolve this issue.
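One common way to check whether the disk itself has gone bad (a sketch, assuming smartmontools is installed and /dev/sdc is the suspect device):
# dmesg | grep -i -E 'sdc|i/o error'
# smartctl -H /dev/sdc
# smartctl -a /dev/sdc | grep -i -E 'reallocated|pending|uncorrect'
The first command looks for kernel I/O errors on the device, the second asks the drive for its overall SMART health verdict, and the third pulls the attributes (reallocated and pending sectors) that usually climb on a failing disk.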
Created 04-25-2018 07:35 PM
Any updates?
Created 04-25-2018 07:38 PM
Hi Geoffrey,
We are just waiting for your approval of the following steps (a fuller sketch with safety checks follows below):
1. umount /grid/sdc, or umount -l /grid/sdc in case the device is busy
2. fsck -y /dev/sdc
3. mount /grid/sdc
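For reference, the same sequence with a couple of safety checks added (a sketch; stop the DataNode from Ambari first so nothing holds the data directory open):
# fuser -vm /grid/sdc
# umount /grid/sdc || umount -l /grid/sdc
# fsck -y /dev/sdc
# mount /grid/sdc
# grep /grid/sdc /proc/mounts
fuser shows any processes still using the mount point, and the final grep confirms the filesystem came back read-write.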
Created 04-25-2018 08:03 PM
Avahi is a system that facilitates service discovery on a local network via the mDNS/DNS-SD protocol suite. It enables you to plug your laptop or computer into a network and instantly see other people you can chat with, printers you can print to, and files being shared. Compatible technology is found in Apple Mac OS X (branded Bonjour, and sometimes Zeroconf).
The two big benefits of Avahi are name resolution and finding printers, but on a server, in a managed environment, it is of little value.
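If you decide it is not needed on the worker nodes, disabling it is straightforward (a sketch, assuming a systemd-based OS):
# systemctl stop avahi-daemon.socket avahi-daemon.service
# systemctl disable avahi-daemon.socket avahi-daemon.service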
Unmounting and mounting filesystems is a common task, especially in Hadoop clusters; your SysOps team should have validated that, but it all looks correct to me.
Do a dry run with the command below to see what would be affected; that will give you a better picture. The -n option opens the filesystem read-only and assumes "no" to every prompt, so nothing is changed:
# e2fsck -n /dev/sdc
The data will be reconstructed, since you have the default replication factor of 3; afterwards you can rebalance the HDFS data.
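Once the DataNode is back up, you can confirm the re-replication and then rebalance (a sketch using the standard HDFS tools, run as the hdfs user):
# sudo -u hdfs hdfs fsck / | grep -i -E 'under-replicated|missing'
# sudo -u hdfs hdfs balancer -threshold 10
fsck reports any blocks still below the target replication, and the balancer evens out disk utilization across DataNodes; the threshold is the allowed deviation from average utilization, in percent.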
Created 04-25-2018 08:05 PM
Yes, we already did that on one of the disks; please see https://community.hortonworks.com/questions/189016/datanode-machine-worker-one-of-the-disks-have-fil...
Created 04-25-2018 08:25 PM
The disk is already unusable, so go ahead and run fsck with the -y option to repair it 🙂 (see above).
Either way, you will have to replace that faulty disk anyway!
