CDH 6.3.0 HBase RegionServer WALs are not archived when "Failed to schedule flush" is logged

I am using HBase on CDH 6.3.0. On several RegionServers the WAL directory (/hbase/WALs) keeps growing, and the logs are not moved to /hbase/oldWALs for deletion.

 

Digging through the RegionServer logs, I found the following warning, which appears under heavy write load.

 

2021-01-27 14:49:16,370 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of region_id_xxxxxxxx, region=null, requester=null
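
To see which regions are affected and how often, the warning can be tallied per region id. The following is a minimal sketch, assuming the log lines have the same shape as the one quoted above; the function name count_failed_flushes is made up for illustration and is not part of HBase.

```python
import re
from collections import Counter

# Matches the LogRoller warning in the shape quoted above.
# The capture group is the region id that follows "flush of".
FLUSH_FAIL = re.compile(r"LogRoller: Failed to schedule flush of (\S+), region=")

def count_failed_flushes(lines):
    """Return a Counter mapping region id -> number of failed-flush warnings."""
    counts = Counter()
    for line in lines:
        match = FLUSH_FAIL.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Demonstration on the line from this post; in practice, pass an open
# RegionServer log file instead of this list.
sample = [
    "2021-01-27 14:49:16,370 WARN org.apache.hadoop.hbase.regionserver.LogRoller: "
    "Failed to schedule flush of region_id_xxxxxxxx, region=null, requester=null",
]
for region, n in count_failed_flushes(sample).most_common():
    print(n, region)
```

The region ids that show up most often can then be mapped back to their tables, for example through the HBase Master web UI.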

 

I suspect that we have hit HBASE-16721 (Concurrency issue in WAL unflushed seqId tracking). However, that issue was already fixed in CDH 5.8.x, so why are we seeing it in 6.3.0?

 

Relevant RegionServer WAL configuration:

 

hbase.regionserver.maxlogs: 64
heap size: 36G
wal blocksize: 256M
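
For a sense of scale, these settings put a rough ceiling on how much WAL data one RegionServer accumulates before the maxlogs limit forces flushes. The sketch below assumes that each WAL file is rolled at block size times hbase.regionserver.logroll.multiplier, and that the multiplier is 0.5; that value is an assumption and is not stated in this post, so check it against your own configuration.

```python
def wal_backlog_mb(max_logs, block_size_mb, roll_multiplier):
    """Approximate WAL volume in MB that accumulates before maxlogs is reached."""
    # Size at which a single WAL file is rolled.
    roll_size_mb = block_size_mb * roll_multiplier
    return max_logs * roll_size_mb

# maxlogs and block size come from the config above; 0.5 is an assumed multiplier.
print(wal_backlog_mb(64, 256, 0.5))  # 8192.0 MB, i.e. 8 GB
```

A WAL directory that grows far beyond this figure suggests that the forced flushes are not releasing the old WALs.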

 

If the WALs are not being archived because of this concurrency issue, how can we fix it? Or are there other causes that could produce this behavior?

Any help would be appreciated!

@gsharma @elserj 

 

1 ACCEPTED SOLUTION

After continuously monitoring the RegionServer logs, we found that the regions being forced to flush all belong to Phoenix tables, which led us to the Phoenix issue PHOENIX-5250.

Since our Phoenix workloads can tolerate occasional data loss, we turned off the WAL for the Phoenix tables with alter table xxx set DISABLE_WAL = true and then restarted the RegionServers that had trouble flushing the problematic edits. (If there are too many WALs, try keeping only the recent ones; otherwise the HMaster will try to replay a large number of WALs.) The WALs are now archived as expected.
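
When several Phoenix tables are involved, the statements can be generated rather than typed by hand. This is a small sketch only: the table names are hypothetical placeholders, and the output is meant to be reviewed and then run in a Phoenix SQL client.

```python
def disable_wal_statements(tables):
    """Build one Phoenix ALTER TABLE statement per table name."""
    return [f"ALTER TABLE {name} SET DISABLE_WAL = true;" for name in tables]

# Hypothetical table names; replace them with the tables whose regions fail to flush.
for stmt in disable_wal_statements(["MY_SCHEMA.EVENTS", "MY_SCHEMA.METRICS"]):
    print(stmt)
```

With the WAL disabled, edits that have not yet been flushed are lost if a RegionServer crashes, so this only fits tables where that loss is acceptable.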

Of course, if DISABLE_WAL = true is not an option for you, you should try upgrading Phoenix.

Best regards.
