CDH 6.3.0 HBase RegionServer WAL logs are not archived when "Failed to schedule flush" is logged

New Contributor

I am using HBase on CDH 6.3.0. On several RegionServers the WAL directory (/hbase/WALs) keeps growing, and the logs are not moved to /hbase/oldWALs for deletion.

 

Digging through the RegionServer logs, I found the following message, which appears under heavy write load:

 

2021-01-27 14:49:16,370 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of region_id_xxxxxxxx, region=null, requester=null
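
To see which regions trigger this warning most often, the RegionServer log can be tallied per region. The following is a minimal sketch, assuming the log line format quoted above; the function name count_failed_flushes is hypothetical, and other HBase versions may word the message differently.

```python
import re
from collections import Counter

# Pattern modeled on the WARN line quoted above (an assumption about the format).
FLUSH_FAIL = re.compile(
    r"LogRoller: Failed to schedule flush of (?P<region>[^,\s]+),"
)


def count_failed_flushes(lines):
    """Return a Counter mapping region identifier -> number of warnings."""
    counts = Counter()
    for line in lines:
        match = FLUSH_FAIL.search(line)
        if match:
            counts[match.group("region")] += 1
    return counts
```

Passing an open log file to count_failed_flushes and printing most_common(10) gives a short list of the regions to look up. Mapping each identifier back to its table (for example through the HBase web UI) then shows whether the affected regions share a common owner.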

 

I suspect we have hit HBASE-16721 (Concurrency issue in WAL unflushed seqId tracking). However, CDH 5.8.x already includes the fix for that issue, so why would we still see it in 6.3.0?

 

Relevant RegionServer WAL configuration:

 

hbase.regionserver.maxlogs: 64
heap size: 36G
wal blocksize: 256M
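
For a rough sense of scale, these settings bound how much WAL data can accumulate before hbase.regionserver.maxlogs forces flushes. The sketch below is an estimate only: it assumes each WAL is rolled at blocksize times hbase.regionserver.logroll.multiplier, and the multiplier of 0.5 is an assumed value that should be checked against the cluster's actual setting. The function name is hypothetical.

```python
def wal_backlog_bytes(max_logs, block_size_bytes, roll_multiplier):
    """Approximate WAL bytes on disk when the maxlogs limit is reached."""
    # Each WAL file is assumed to roll at block_size * roll_multiplier.
    return int(max_logs * block_size_bytes * roll_multiplier)


MIB = 1024 ** 2
GIB = 1024 ** 3

# Values from the post; roll_multiplier=0.5 is an assumption, not from the post.
estimate = wal_backlog_bytes(max_logs=64, block_size_bytes=256 * MIB,
                             roll_multiplier=0.5)
print(f"{estimate / GIB:.1f} GiB")
```

If the WALs directory grows well beyond this estimate, the forced flushes are not succeeding, which is consistent with the warning quoted above.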

 

If the WALs are not being archived because of a concurrency issue, how can we fix it? Or is there another reason that could cause this behavior?


Any help would be appreciated!

@gsharma @elserj 

 

Accepted Solutions

Re: CDH 6.3.0 HBase RegionServer WAL logs are not archived when "Failed to schedule flush" is logged

New Contributor

After continuously monitoring the RegionServer log, we found that the regions being forced to flush all belong to Phoenix tables, which led us to the Phoenix issue PHOENIX-5250.

Our Phoenix workloads can tolerate occasional data loss, so we turned off the WAL for the Phoenix tables with ALTER TABLE xxx SET DISABLE_WAL = true and then restarted the RegionServers that could not flush the problematic edits. (If a RegionServer has too many WALs, consider keeping only the recent ones; otherwise the HMaster will try to replay a large number of WALs.) Since then, WALs are archived as expected.

Of course, if DISABLE_WAL = true is not an option for you, you should try upgrading Phoenix.

Best regards.

