Support Questions


HBase RegionServer goes down because it could not obtain a block and can't find a file. I think this could be due to compaction.

Explorer
2019-01-08 16:22:29,475 WARN  [MemStoreFlusher.0] impl.BlockReaderFactory: I/O error constructing remote block reader.
java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/<ip>:51076 remote=/<ip>:50010]


2019-01-08 16:22:29,477 WARN  [MemStoreFlusher.0] hdfs.DFSClient: Failed to connect to /<ip>:50010 for file /apps/hbase/data/data/default/EDA_ATTACHMENTS/376661f95c7be7f667a876480e732976/.tmp/DATA/92e54a03a44042a1be63a7ff04158792 for block BP-869721575-<ip>-1543446665241:blk_1073772872_32065, add to deadNodes and continue.


2019-01-08 16:31:06,275 ERROR [regionserver/hadoop-2:16020-shortCompactions-1546916652740] regionserver.CompactSplit: Compaction failed region=EDA_ATTACHMENTS,,1546990167772.005c417fdc141d22d49c63fe93014aa8., storeName=DATA, priority=96, startTime=1546990170152
org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://hadoop-1.nit.disa.mil:8020/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/3e9c1942fe484a26a81ba5a2578a69d5
        at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:545)
        at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:579)
        at org.apache.hadoop.hbase.regionserver.StoreFileReader.<init>(StoreFileReader.java:104)
        at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:270)
        at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:357)
        at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:465)
        at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:683)
        at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:676)
        at org.apache.hadoop.hbase.regionserver.HStore.validateStoreFile(HStore.java:1858)
        at org.apache.hadoop.hbase.regionserver.HStore.moveFileIntoPlace(HStore.java:1431)
        at org.apache.hadoop.hbase.regionserver.HStore.moveCompactedFilesIntoPlace(HStore.java:1419)
        at org.apache.hadoop.hbase.regionserver.HStore.doCompaction(HStore.java:1387)
        at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2095)
        at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:592)
        at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:634)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-869721575-<ip>-1543446665241:blk_1073772893_32086 file=/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/3e9c1942fe484a26a81ba5a2578a69d5
        at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:870)
        at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:853)
        at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:832)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:564)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:754)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:820)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:401)
        at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:532)
4 Replies

Explorer

I'm getting "could not obtain block" errors in the RegionServer logs. The log above is one example, but there are other errors related to the same failure to locate the block or file. The file does exist in HDFS. The RegionServers can't recover and eventually crash; after a restart they again try to locate the block, fail, and go down.
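The SocketTimeoutException to port 50010 in the first log, and the "add to deadNodes" message, suggest a DataNode that is unreachable or too slow to serve the block. Before digging into HBase itself, it may be worth confirming that every DataNode is live and reachable from the RegionServer hosts. A minimal sketch, using standard HDFS commands (`<datanode-ip>` is a placeholder):

```shell
# List live/dead DataNodes as the NameNode sees them
hdfs dfsadmin -report

# Check that the DataNode transfer port (50010 by default in Hadoop 2.x)
# accepts connections from this RegionServer host
nc -zv -w 5 <datanode-ip> 50010

# Ask HDFS for any corrupt or missing blocks cluster-wide
hdfs fsck / -list-corruptfileblocks
```

If `dfsadmin -report` shows dead nodes, or `fsck` reports missing blocks, the problem is on the HDFS side rather than in HBase.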

Explorer

I see these "could not obtain block" errors in the log; here's another one. The block is OK according to hdfs fsck <path> -files -blocks, which makes me think HBase can't read the file because it can't read the HFile trailer. I need to figure out how to verify/validate and repair the HFile.

2019-01-14 09:58:00,575 ERROR [regionserver/hadoop-2:16020-shortCompactions-1547430414622] regionserver.CompactSplit: Compaction failed region=EDA_ATTACHMENTS,,1546990167772.005c417fdc141d22d49c63fe93014aa8., storeName=DATA, priority=96, startTime=1547484994585
org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://hadoop-1.nit.disa.mil:8020/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a
        at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:545)
        at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:579)
        at org.apache.hadoop.hbase.regionserver.StoreFileReader.<init>(StoreFileReader.java:104)
        at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:270)
        at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:357)
        at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:465)
        at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:683)
        at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:676)
        at org.apache.hadoop.hbase.regionserver.HStore.validateStoreFile(HStore.java:1858)
        at org.apache.hadoop.hbase.regionserver.HStore.moveFileIntoPlace(HStore.java:1431)
        at org.apache.hadoop.hbase.regionserver.HStore.moveCompactedFilesIntoPlace(HStore.java:1419)
        at org.apache.hadoop.hbase.regionserver.HStore.doCompaction(HStore.java:1387)
        at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1375)
        at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2095)
        at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:592)
        at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:634)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-869721575-207.132.83.245-1543446665241:blk_1073784662_43855 file=/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a
        at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:870)
        at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:853)
___


This shows the block is OK:
hdfs fsck /apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a -files -blocks


/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a 85047198 bytes, replicated: replication=2, 1 block(s):  OK
0. BP-869721575-207.132.83.245-1543446665241:blk_1073784662_43855 len=85047198 Live_repl=2
___
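On "verify/validate the hfile": HBase ships an HFile inspection tool (HFilePrettyPrinter) that reads the trailer and metadata directly, so it should fail fast on exactly the trailer problem in the stack trace. A hedged sketch; the path below is the store file from the error, and flag behavior may vary slightly by HBase version:

```shell
# Print the HFile metadata, including the trailer (-m = metadata, -v = verbose).
# If the trailer is unreadable, this should fail with a similar CorruptHFileException.
hbase org.apache.hadoop.hbase.io.hfile.HFile -m -v \
  -f hdfs://hadoop-1.nit.disa.mil:8020/apps/hbase/data/data/default/EDA_ATTACHMENTS/005c417fdc141d22d49c63fe93014aa8/.tmp/DATA/84c4ac0eb34048f88b2c6267eb4b0f1a

# Optionally also check row ordering (-k) and print aggregate stats (-s):
# hbase org.apache.hadoop.hbase.io.hfile.HFile -k -s -f <same-path>
```

If fsck says the block is healthy but this tool cannot read the trailer, that points at the file contents (or something interfering with reads on the DataNode) rather than at block placement.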

Explorer

This shows no corrupt HFiles:

./hbase hbck -checkCorruptHFiles
Checked 117 hfile for corruption
  HFiles corrupted: 0
  HFiles moved while checking: 0
  Mob files moved while checking: 0
Summary: OK
Mob summary: OK

Explorer

The problem appears to have been caused by the virus-scanning software that was running on the cluster nodes.
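If an on-access virus scanner was scanning or locking the HDFS block files while the DataNode tried to serve them, excluding the Hadoop data directories from scanning should prevent a recurrence. A hedged sketch for finding the directories to exclude (the exclusion mechanism itself depends on the AV product):

```shell
# Directories where DataNodes store block files; exclude these from on-access scanning
hdfs getconf -confKey dfs.datanode.data.dir

# NameNode metadata directories, also worth excluding
hdfs getconf -confKey dfs.namenode.name.dir
```

The same reasoning applies to any local directories used by the RegionServers, such as log and temp directories configured in hbase-site.xml.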