Created on 12-07-2020 03:34 AM - edited 12-07-2020 04:07 AM
Hello Team,
I am facing an issue in my Cloudera cluster: HDFS used space keeps growing, and "/hbase/oldWALs" is occupying more than 50%.
I can confirm that HBase replication is disabled and the TTL is set to 1 minute.
I also checked the list of peers in HBase:
hbase(main):001:0> list_peers
PEER_ID CLUSTER_KEY STATE TABLE_CFS
0 row(s) in 0.2360 seconds
I don't see any peers. Please help me with your comments.
Thanks & Regards,
Vinod
Created 12-07-2020 10:59 PM
Can someone please help me?
That would be great and thanks in advance...!!
Created 12-08-2020 03:23 AM
Hello @kvinod
Thanks for using Cloudera Community. Your concern is that HBase oldWALs under the HDFS path "/hbase/oldWALs" are occupying a lot of space, while HBase Replication isn't being used and the TTL is set to 1 minute.
The HMaster logs at TRACE level capture the CleanerChore activity in detail, but first I'd like to check whether you have tried the following 3 options:
1. Restart the HMaster Service to rule out any issue with the CleanerChore.
2. Confirm that the parameter "hbase.replication" is explicitly set to false via the steps shared under [1].
3. Check whether "/hbase/replication" has any entries. If no replication is used (HBase Replication or Lily Indexer), try removing the "/hbase/replication" ZNode and restart the HMaster Service.
- Smarak
[1] CM=> HBase=> Configuration=> Advanced=> HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml
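For readers unsure what to paste into the Safety Valve field from [1], here is a minimal sketch that renders the property in hbase-site.xml form. The helper name `safety_valve_property` is hypothetical and only for illustration; the parameter name and value are the ones mentioned in this thread.

```python
import xml.etree.ElementTree as ET


def safety_valve_property(name: str, value: str) -> str:
    """Render one <property> element in hbase-site.xml form (hypothetical helper)."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(prop, encoding="unicode")


# Parameter name and value as discussed in this thread
snippet = safety_valve_property("hbase.replication", "false")
print(snippet)
```

The printed element is what would go into the snippet field; a restart of the HBase service is still needed afterwards, as noted in the steps above.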
Created 12-08-2020 05:29 AM
Hello @smdas
Thank you so much for your response.
I followed the above steps; please find the details below:
ls /hbase/replication
[peers, rs]
ls /hbase/replication/peers
[]
I then deleted the replication ZNode, added hbase.replication as false in hbase-site.xml, and restarted HBase.
Please find the details after the restart:
ls /hbase/replication
[peers, rs]
ls /hbase/replication/peers
[]
And now I can see that the /hbase/oldWALs directory in HDFS has been cleared and is empty.
But as you can see in the attached screenshot, replication was not enabled in the first place, right?
So what difference did this make?
Regards,
Vinod
Created 12-08-2020 05:48 AM
Hello @kvinod
Thanks for the update. The replication ZNode being recreated after the restart is expected.
The HBase Replication checkbox being left unchecked indicates that replication is disabled, yet I have observed a couple of cases in which a CM config wasn't passed down to the service level, causing unexpected behaviour. Explicitly adding the parameter ensures that the service (HBase in this case) is aware of the configuration.
Alternatively, the Master restart (performed via the HBase restart) may have resolved the issue by spawning a new CleanerChore thread. As such, the issue was likely either the HBase service being unaware that replication was disabled, or the HMaster CleanerChore thread itself. By explicitly setting HBase replication to false and restarting the HBase service, we covered both possibilities.
- Smarak
Created on 12-08-2020 06:36 AM - edited 12-08-2020 06:51 AM
Thank you so much, it is resolved.
But why was it not replicating?
If files are kept under the /oldWALs directory, they should be replicated, right?
Could you please clarify?
Best Regards,
Vinod
Created 12-08-2020 09:12 AM
Hello @kvinod
As cluster replication wasn't being used (as shown by "list_peers" not listing any peer), it's likely that the CleanerChore thread wasn't performing its duties. Note that WALs are moved to oldWALs once the last sequence IDs of the WALs have been persisted to disk via a MemStore flush. In other words, files being present in oldWALs doesn't necessarily mean that the WALs are being retained for replication. The cleanup of oldWALs is the CleanerChore thread's responsibility. As covered above, the HBase service restart included an HMaster restart, which ensures the CleanerChore thread is spawned afresh.
Let me know if the above answers your queries.
- Smarak
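To make the distinction between "sitting in oldWALs" and "held for replication" concrete, here is a toy model of one cleaner pass. It is a simplified sketch, not HBase's actual code: the names `OldWal`, `is_deletable` and `run_cleaner_once` are hypothetical, and it assumes only the two conditions discussed in this thread (the TTL and whether a replication queue still references the file).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class OldWal:
    name: str
    age_seconds: float            # time since the file landed in oldWALs
    queued_for_replication: bool  # still referenced by a replication queue


def is_deletable(wal: OldWal, ttl_seconds: float) -> bool:
    """A file may go only if it is past the TTL and no peer still needs it."""
    return wal.age_seconds > ttl_seconds and not wal.queued_for_replication


def run_cleaner_once(wals: List[OldWal], ttl_seconds: float) -> Tuple[List[OldWal], List[OldWal]]:
    """One pass of the (modelled) cleaner: returns (kept, removed)."""
    removed = [w for w in wals if is_deletable(w, ttl_seconds)]
    kept = [w for w in wals if not is_deletable(w, ttl_seconds)]
    return kept, removed


wals = [
    OldWal("wal-a", age_seconds=3600, queued_for_replication=False),
    OldWal("wal-b", age_seconds=30, queued_for_replication=False),
    OldWal("wal-c", age_seconds=3600, queued_for_replication=True),
]

# TTL of 1 minute, as in this thread
kept, removed = run_cleaner_once(wals, ttl_seconds=60)
print([w.name for w in removed])  # ['wal-a']
print([w.name for w in kept])     # ['wal-b', 'wal-c']
```

In this model nothing is removed unless `run_cleaner_once` is actually called, which mirrors the situation above: with no peers and a 1 minute TTL every old file was eligible, yet the directory kept growing until the restart got the cleaner running again.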
Created 12-08-2020 11:18 PM