Contributor
Posts: 31
Registered: ‎10-13-2016

HBase oldWALs are not being released

 

Hi,

 

In two clusters I am having a problem with a continuously growing oldWALs folder. My suspicion is that this has something to do with HBase-indexer and HBase cluster replication, hence I am writing here.

 

Some facts:

1. New entries/edits still seem to be indexed into Solr

2. The oldWALs folder keeps growing as if its files were never released (it currently takes up 90 GB)

3. As expected, list_peers reports four ENABLED indexers

 

HBase-indexer logs report nothing unusual: 

 

DEBUG org.apache.zookeeper.ClientCnxn: Got ping response for sessionid: 0x15983bfe608005c after 0ms

 

 

Region server logs:

 

2017-01-11 13:58:35,840 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Failed to recover lease, attempt=4 on file=hdfs://cdh-hostname:8020/hbase/WALs/cdh-hostname,60020,1462909733915-splitting/cdh-hostname%2C60020%2C1462909733915.1462909799649 after 389468ms
2017-01-11 13:58:58,983 INFO org.apache.hadoop.hbase.replication.regionserver.Replication: Normal source for cluster Indexer_first_indexer: Total replicated edits: 9326, current progress: 
walGroup [cdh-hostname%2C60020%2C1483974167561.null0]: currently replicating from: hdfs://cdh-hostname:8020/hbase/WALs/cdh-hostname,60020,1483974167561/cdh-hostname%2C60020%2C1483974167561.null0.1484140060245 at position: 94730757

Normal source for cluster Indexer_second_indexer: Total replicated edits: 0, current progress: 
walGroup [cdh-hostname%2C60020%2C1483974167561.null0]: currently replicating from: hdfs://cdh-hostname:8020/hbase/WALs/cdh-hostname,60020,1483974167561/cdh-hostname%2C60020%2C1483974167561.null0.1483974265619 at position: 191225

Normal source for cluster Indexer_third_indexer: Total replicated edits: 0, current progress: 
walGroup [cdh-hostname%2C60020%2C1483974167561.null0]: currently replicating from: hdfs://cdh-hostname:8020/hbase/WALs/cdh-hostname,60020,1483974167561/cdh-hostname%2C60020%2C1483974167561.null0.1483974265619 at position: 191225

Normal source for cluster Indexer_fourth_indexer: Total replicated edits: 0, current progress: 
walGroup [cdh-hostname%2C60020%2C1483974167561.null0]: currently replicating from: hdfs://cdh-hostname:8020/hbase/WALs/cdh-hostname,60020,1483974167561/cdh-hostname%2C60020%2C1483974167561.null0.1483974265619 at position: 191225

Recovered source for cluster/machine(s) Indexer_third_indexer: Total replicated edits: 0, current progress: 
walGroup [cdh-hostname%2C60020%2C1478856370573.null0]: currently replicating from: hdfs://cdh-hostname:8020/hbase/oldWALs/cdh-hostname%2C60020%2C1478856370573.null0.1479467681784 at position: 4200

Recovered source for cluster/machine(s) Indexer_second_indexer: Total replicated edits: 0, current progress: 
walGroup [cdh-hostname%2C60020%2C1478856370573.null0]: currently replicating from: hdfs://cdh-hostname:8020/hbase/oldWALs/cdh-hostname%2C60020%2C1478856370573.null0.1481030503857 at position: 11840

Recovered source for cluster/machine(s) Indexer_fourth_indexer: Total replicated edits: 0, current progress: 
walGroup [cdh-hostname%2C60020%2C1478856370573.null0]: currently replicating from: hdfs://cdh-hostname:8020/hbase/oldWALs/cdh-hostname%2C60020%2C1478856370573.null0.1481030503857 at position: 11840
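
For anyone trying to read a status dump like the one above, here is a minimal Python sketch that turns it into one record per replication source. The regular expressions are derived only from the log lines quoted in this post, so they are an assumption about the format and may need adjusting for other HBase versions. The names `SourceStatus` and `parse_sources` are made up for illustration.

```python
import re
from typing import List, NamedTuple

# Patterns inferred from the region server log lines quoted above (assumption).
HEADER = re.compile(
    r"(Normal|Recovered) source for cluster(?:/machine\(s\))? (\S+): "
    r"Total replicated edits: (\d+)"
)
PROGRESS = re.compile(
    r"walGroup \[(.+?)\]: currently replicating from: (\S+) at position: (\d+)"
)


class SourceStatus(NamedTuple):
    kind: str          # "Normal" or "Recovered"
    peer: str          # peer id, e.g. Indexer_first_indexer
    edits: int         # "Total replicated edits"
    wal: str           # WAL file the source is currently reading
    position: int      # byte position reported in the log
    in_old_wals: bool  # True if the WAL path is under /oldWALs/


def parse_sources(log_text: str) -> List[SourceStatus]:
    """Pair each 'source for cluster' line with the walGroup line after it."""
    sources = []
    pending = None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            pending = header
            continue
        progress = PROGRESS.search(line)
        if progress and pending:
            wal = progress.group(2)
            sources.append(
                SourceStatus(
                    kind=pending.group(1),
                    peer=pending.group(2),
                    edits=int(pending.group(3)),
                    wal=wal,
                    position=int(progress.group(3)),
                    in_old_wals="/oldWALs/" in wal,
                )
            )
            pending = None
    return sources
```

Filtering the result with something like `[s for s in parse_sources(text) if s.in_old_wals and s.edits == 0]` lists the sources that are still reading from oldWALs without having replicated anything. Whether those sources are what keeps the files from being cleaned up is a guess that still has to be confirmed on the cluster.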

 

 

Could you please advise me on how to debug replication-related issues? I can't afford to disable the indexers, because the data in Solr would then become inconsistent and I would need to reindex the HBase tables from scratch.

 

Thanks,

Gin

 

Posts: 177
Topics: 8
Kudos: 27
Solutions: 19
Registered: ‎07-16-2015

Re: HBase oldWALs are not being released


Hi,

 

I think you are right to suspect the replication. It is usually the culprit for a growing oldWALs folder.

That said, I don't know of a way to hotfix this without deleting the peer, sorry.

 

If you have access to Cloudera support, I think this would be a relevant topic to discuss with them.

Contributor
Posts: 31
Registered: ‎10-13-2016

Re: HBase oldWALs are not being released


Thanks, Mathieu. Unfortunately, I do not have access to Cloudera's support.

 

I have deleted the indexers for now.

Posts: 177
Topics: 8
Kudos: 27
Solutions: 19
Registered: ‎07-16-2015

Re: HBase oldWALs are not being released

I know we have already faced this situation once some time ago.

If we encounter it again, we will raise a support ticket.

New Contributor
Posts: 1
Registered: ‎08-24-2018

Re: HBase oldWALs are not being released


We went through this blog post:


blog.spagoworld.org/2016/07/solr-on-top-of-hbase-for-dashboards/

We followed all the steps exactly as described. The problem we are facing is that whenever we insert data into HBase, that data is not replicated into Solr.
When we debugged further, we found that HBase-indexer is listening to the region server WAL logs. When we check the region server logs, we see that it keeps replicating from the same line number (1850) and that the total replicated edits is 0 every time. I am attaching a screenshot as well.
I would really appreciate your help. Please let me know if you need additional info regarding this issue.

Attachment: 2018-08-24-53.png
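
To check the "same position, zero edits every time" symptom across several log dumps rather than by eye, here is a minimal Python sketch. It assumes you have already collected `(peer, position, total_edits)` readings in time order. The function name `stalled_peers`, the `min_samples` threshold, and the peer names in the usage note are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[str, int, int]  # (peer id, position, total replicated edits)


def stalled_peers(samples: Iterable[Sample], min_samples: int = 3) -> List[str]:
    """Return peers whose position and edit count never changed.

    A peer is only reported once it has at least `min_samples` readings,
    so a single repeated value is not flagged.
    """
    history: Dict[str, List[Tuple[int, int]]] = {}
    for peer, position, edits in samples:
        history.setdefault(peer, []).append((position, edits))
    return sorted(
        peer
        for peer, seen in history.items()
        if len(seen) >= min_samples and len(set(seen)) == 1
    )
```

For example, three readings of `("indexer_a", 1850, 0)` would put `indexer_a` in the result. An unchanged position is only a hint: it can also mean that nothing new was written to the table between readings, so it is worth inserting a test row before drawing conclusions.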
