Created on 11-28-2016 07:13 AM - edited 09-16-2022 03:49 AM
Hi,
At some point "test_indexer" stopped indexing new entries from HBase.
The table has its replication scope set to 1. Judging from the logs, both "test_indexer" and the Solr server appear to be working fine.
Looking at the RegionServer logs I see the following warning being repeated all day:
WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSource Indexer_test_indexer Got:
java.io.EOFException: hdfs://tets-node:8020/hbase/oldWALstest-node%2C60020%2C1480189429512.null0.1480189450431 not a SequenceFile
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1919)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1878)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1827)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:70)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.reset(SequenceFileLogReader.java:168)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.initReader(SequenceFileLogReader.java:177)
    at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:66)
    at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:302)
    at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:267)
    at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:255)
    at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:406)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationWALReaderManager.openReader(ReplicationWALReaderManager.java:70)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource$ReplicationSourceWorkerThread.openReader(ReplicationSource.java:745)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource$ReplicationSourceWorkerThread.run(ReplicationSource.java:541)
The /hbase/oldWALs folder is taking up 8.7 GB and is growing slowly but noticeably.
So my guess is the indexer is not marking the WAL entries as replicated and they keep accumulating in the oldWALs folder. How do I determine the cause of the issue?
Could you also explain to me what this exception means? The set-up worked just fine a day ago.
Thanks,
Gin
Created 12-09-2016 08:36 AM
Yes, the 0-byte old WALs were the culprit.
The hbase-indexer could not get past the corrupt files no matter what. Removing them was not an option, as the hbase-indexer was still expecting these files.
I did a dirty trick and copy-pasted the contents of other old WALs into the 0-byte files. The source WALs, judging from their size and contents, looked like empty carriers.
That did the trick and hbase-indexer finally consumed all of the old WALs.
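For anyone trying to locate such files, here is a minimal sketch in Python that picks the zero-byte entries out of an `hdfs dfs -ls /hbase/oldWALs` listing. It assumes the usual eight-column listing layout (permissions, replication, owner, group, size, date, time, path); the function name `find_empty_wals` and the file names in `SAMPLE` are made up for illustration and do not come from this cluster.

```python
def find_empty_wals(ls_output):
    """Return the paths of zero-byte files found in `hdfs dfs -ls` output."""
    empty = []
    for line in ls_output.splitlines():
        # Assumed columns: perms, replication, owner, group, size, date, time, path.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # skips the "Found N items" header and blank lines
        if not fields[0].startswith("-"):
            continue  # skips directories ("d...") and anything that is not a plain file
        if fields[4] == "0":
            empty.append(fields[7])
    return empty


# Hypothetical listing, for illustration only.
SAMPLE = """Found 3 items
-rw-r--r--   3 hbase hbase          0 2016-11-26 19:44 /hbase/oldWALs/example-wal-a
-rw-r--r--   3 hbase hbase         91 2016-11-26 20:44 /hbase/oldWALs/example-wal-b
drwxr-xr-x   - hbase hbase          0 2016-11-26 19:40 /hbase/oldWALs/example-dir
"""

if __name__ == "__main__":
    for path in find_empty_wals(SAMPLE):
        print(path)
```

On a live cluster the listing text could be captured with `subprocess.run(["hdfs", "dfs", "-ls", "/hbase/oldWALs"], capture_output=True, text=True).stdout` and passed to the same function. The sketch only reports candidates; it does not modify anything in HDFS.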
Created 11-28-2016 07:52 AM
Created 12-19-2016 06:06 AM
Thanks, that was interesting to know!
Created 10-04-2019 12:02 PM
I am having the same error, but I didn't understand the solution. Can you please explain it? Thank you.