oldWALs are not being deleted

Rising Star

Hello Team,

I am facing an issue in my Cloudera cluster: HDFS used space keeps growing, and "/hbase/oldWALs" occupies more than 50% of the space.

I can confirm that HBase replication is disabled and the log cleaner TTL is set to 1 minute.

hbase.master.logcleaner.ttl = 1 Min
hbase.replication = false

In the HBase logs I can see the warning below:
WARN org.apache.hadoop.hbase.master.cleaner.CleanerChore: A file cleanerhostname 60000.oldLogCleaner is stopped, won't delete any more files in /nameservice1/hbase/oldWALs

I also checked the list of peers in HBase:

hbase(main):001:0> list_peers
PEER_ID CLUSTER_KEY STATE TABLE_CFS
0 row(s) in 0.2360 seconds

No peers are listed. Please help me with your comments.

 

 

Thanks & Regards,

Vinod

Rising Star

Can someone please help me?

That would be great. Thanks in advance!

Super Collaborator

Hello @kvinod 

 

Thanks for using Cloudera Community. As I understand it, your concern is that HBase oldWALs under the HDFS path "/hbase/oldWALs" are occupying a lot of space, while HBase Replication isn't being used and the TTL is set to 1 minute.

HMaster TRACE-level logs would capture the CleanerChore activity in detail, but first I wish to check whether you have tried the following 3 options:

1. Restart the HMaster service to rule out any issue with the CleanerChore.

2. Confirm that the parameter "hbase.replication" is set to false via the steps shared under [1].

3. Check whether the "/hbase/replication" ZNode has any entries. If no replication is utilised (HBase Replication or Lily Indexer), try removing the "/hbase/replication" ZNode and restart the HMaster service.

 

- Smarak

 

[1] CM=> HBase=> Configuration=> Advanced=> HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml
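
For option 2, the safety valve field takes ordinary hbase-site.xml property elements. Below is a minimal Python sketch that renders such an element. The helper name render_property is hypothetical; the only property taken from this thread is hbase.replication with the value false.

```python
import xml.etree.ElementTree as ET


def render_property(name: str, value: str) -> str:
    """Render one hbase-site.xml <property> element as text (hypothetical helper)."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


# Property named in this thread; the printed element is what would be pasted
# into the safety valve field described in [1].
snippet = render_property("hbase.replication", "false")
print(snippet)
```

The same helper can render any other name/value pair, but whether a given property is honoured depends on the HBase version in use.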

Rising Star

Hello @smdas 

 

Thank you so much for your response.

I followed the steps above and found the details below:

ls /hbase/replication
[peers, rs]

ls /hbase/replication/peers
[]

I have now deleted the replication ZNode, added hbase.replication as false in hbase-site.xml, and restarted HBase.

Details after the restart:

ls /hbase/replication
[peers, rs]

ls /hbase/replication/peers
[]

The /hbase/oldWALs directory in HDFS is now cleared and empty.

However, as the attached screenshot shows, replication was not enabled in the first place, right?
Does adding the parameter explicitly make any difference?

 

HBase_replication.PNG

 

Regards,

Vinod

 

Super Collaborator

Hello @kvinod 

 

Thanks for the update. The replication ZNode being recreated after the restart is expected.

The HBase Replication checkbox being left unchecked indicates that replication is disabled, yet I have observed a couple of cases in which a CM config wasn't passed down to the service level, causing unexpected behaviour. The parameter was added explicitly to ensure that the service (HBase in this case) is aware of the configuration.

Alternatively, the Master restart (performed via the HBase restart) may have resolved the issue by spawning a new CleanerChore thread. As such, the issue likely lay either with the HBase service being unaware that replication was disabled or with the HMaster CleanerChore thread. By explicitly setting HBase replication to false and restarting the HBase service, we covered both possibilities.
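
One way to check the CleanerChore possibility is to look for the stopped-cleaner warning in the HMaster log. Below is a minimal Python sketch, assuming the log line resembles the WARN quoted in the question; the regex, the function name, and the sample lines are illustrative assumptions, not an official log format.

```python
import re

# Pattern based on the WARN line quoted in the question; the exact layout of
# real HMaster logs may differ, so treat this regex as an assumption.
STOPPED = re.compile(r"CleanerChore.*is stopped, won't delete any more files")


def cleaner_stopped_lines(lines):
    """Return the log lines that report a stopped cleaner chore."""
    return [line.rstrip("\n") for line in lines if STOPPED.search(line)]


# Hypothetical sample input; the second entry mirrors the warning in the question.
sample = [
    "INFO some.other.Class: unrelated message\n",
    "WARN org.apache.hadoop.hbase.master.cleaner.CleanerChore: A file cleaner"
    "hostname 60000.oldLogCleaner is stopped, won't delete any more files in "
    "/nameservice1/hbase/oldWALs\n",
]
hits = cleaner_stopped_lines(sample)
print(len(hits))
```

Against a real log, the function can be given an open file object instead of the sample list. If the warning still appears after the restart, the cleaner thread is the more likely culprit.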

 

- Smarak


Rising Star

Thank you so much, it is resolved.

But why was it not replicating?

If the files are kept under the /oldWALs directory, shouldn't they be replicated?

Could you please clarify?

 

 

Best Regards,

Vinod

Super Collaborator

Hello @kvinod 

 

As cluster replication wasn't being used ("list_peers" shows no peer), it's likely that the CleanerChore thread wasn't performing its duties. Note that WALs are moved to oldWALs once the last sequence IDs of the WALs have been persisted to disk via MemStore flush. In other words, oldWALs being present doesn't necessarily mean that the WALs are being retained for replication. The cleanup of oldWALs is the CleanerChore thread's responsibility. As covered above, the HBase service restart included an HMaster restart, which ensured that the CleanerChore thread was spawned afresh.
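
To illustrate why files can pile up even with a 1 minute TTL, here is a simplified Python model of TTL-based eligibility. It is a sketch, not HBase's actual cleaner code: it deliberately ignores replication and any other conditions HBase applies, and the file names and timestamps are hypothetical. The key point is that eligibility only matters while the cleaner thread is running; a stopped chore deletes nothing regardless of the TTL.

```python
def is_past_ttl(mtime_ms: int, now_ms: int, ttl_ms: int) -> bool:
    """Simplified model: a file qualifies for cleanup once its age exceeds the TTL."""
    age_ms = now_ms - mtime_ms
    return age_ms > ttl_ms


TTL_MS = 60_000  # 1 minute, the TTL reported in the question
now = 1_000_000  # hypothetical "current time" in milliseconds

# Hypothetical file names and modification times.
old_wals = {"wal-a": now - 120_000, "wal-b": now - 30_000}

eligible = sorted(
    name for name, mtime in old_wals.items() if is_past_ttl(mtime, now, TTL_MS)
)
print(eligible)
```

In this model only wal-a, which is two minutes old, is past the TTL; wal-b is still within it.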

 

Let me know if the above answers your queries.

 

- Smarak

Rising Star

Thanks @smdas for the clarification.

 

Best Regards,

Vinod