After continuously monitoring the regionserver logs, we found that the regions being force-flushed all belonged to Phoenix tables, which led us to the Phoenix issue PHOENIX-5250. Since our Phoenix-related workloads can tolerate occasional data loss, we turned off the WAL for the Phoenix tables with alter table xxx set DISABLE_WAL = true and then restarted the regionservers that had trouble flushing the problematic edits. (If too many WALs have piled up, try keeping only the recent ones; otherwise HMaster will try to replay a large number of WALs.) WALs are now archived as expected. Of course, if DISABLE_WAL = true is not an option for you, you should try upgrading Phoenix. Best regards.
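For anyone who wants to repeat the log check, here is a minimal sketch of how the failed flushes could be tallied per region. It assumes the WARN line has the same shape as the one quoted in the question ("Failed to schedule flush of <encoded region>, region=null, requester=null"). The function name count_failed_flushes is my own and not part of any HBase tooling.

```python
import re
from collections import Counter

# Matches the LogRoller warning quoted in this thread and captures the
# encoded region name that follows "Failed to schedule flush of".
FAILED_FLUSH = re.compile(r"Failed to schedule flush of ([^,\s]+)")


def count_failed_flushes(lines):
    """Count 'Failed to schedule flush' warnings per encoded region name."""
    counts = Counter()
    for line in lines:
        match = FAILED_FLUSH.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def report(lines):
    """Return 'count region' rows, most frequent region first."""
    return [f"{n} {region}" for region, n in count_failed_flushes(lines).most_common()]
```

Feed it the lines of a regionserver log (for example `report(open(path))`, where path is wherever your regionserver log lives). The encoded region names it reports can then be looked up in hbase:meta or the HMaster web UI to see which tables they belong to. In our case they all turned out to be Phoenix tables.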
I am using HBase on CDH 6.3.0. On several regionservers the WAL directory (/hbase/wals) keeps growing, and the logs are not moved to /hbase/oldWals for deletion. Digging through the regionserver logs, I found the following message, which appears under heavy write load:
2021-01-27 14:49:16,370 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of region_id_xxxxxxxx, region=null, requester=null
I suspect we have hit HBASE-16721 (Concurrency issue in WAL unflushed seqId tracking). However, CDH 5.8.x already includes the fix for that issue, so why would we run into it in 6.3.0?
HBase regionserver WAL-related config:
hbase.regionserver.maxlogs: 64
heap size: 36G
wal blocksize: 256M
If the WALs are not being archived because of the concurrency issue, how can it be fixed? Or are there other causes that could produce this behavior? Any help would be appreciated! @gsharma @elserj
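To put the config above in perspective: once the number of WAL files exceeds hbase.regionserver.maxlogs, the regionserver asks the regions holding the oldest unflushed edits to flush so that the old WALs can be archived. The rough sketch below estimates how much WAL data corresponds to that limit. Each WAL is assumed to roll at about block size times hbase.regionserver.logroll.multiplier. The multiplier values in the loop are assumptions, not values taken from this cluster, so check the actual setting before relying on the numbers.

```python
def wal_ceiling_mb(max_logs, block_size_mb, roll_multiplier):
    """Approximate WAL volume in MB at which the maxlogs limit is reached."""
    # A WAL file is rolled once it reaches roughly block size * multiplier.
    roll_size_mb = block_size_mb * roll_multiplier
    return max_logs * roll_size_mb


# Values from the question: maxlogs = 64, WAL block size = 256 MB.
# The multipliers are assumed; verify hbase.regionserver.logroll.multiplier.
for multiplier in (0.5, 0.95):
    print(f"multiplier={multiplier}: ~{wal_ceiling_mb(64, 256, multiplier):.0f} MB")
```

If /hbase/wals for a regionserver holds far more than this estimate, the forced flushes are not freeing the old WALs, which matches the "Failed to schedule flush" warnings above.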