HBase flush/writes not working for one of the column families in an HBase table

Contributor

Hi All,

 

  • I have two Cloudera clusters, and each cluster has one HBase table.
  • I write to cluster1 first, and the data is then replicated to cluster2 through WAL replication.
  • Cluster1 has 3 region servers and cluster2 has 80 region servers.
  • I am using org.apache.hadoop.hbase.spark.HBaseContext.bulkPut to write data to HBase.
  • My table has two column families. In the source cluster (cluster1), writes/flushes succeed for one column family; for the other column family, the region server logs show no errors or warnings about flush or write failures.
  • One weird thing: my writes reach the WAL successfully (data for both column families) and replicate successfully to the destination cluster (cluster2), where I can see data in both column families.
  • In the source, however, I believe the writes only reach the WAL and are somehow not picked up by the memstore, so they are never flushed for one of the column families. (It works for one column family and not for the other.)
  • I tried writing data to that column family manually from the HBase shell (put command), and it works fine without any issues.

 

 

What I have tried so far to fix this:

  • Ran hbase hbck -details; no inconsistencies found.
  • Used the hbck2 tool to fix the HDFS filesystem for the HBase tables/HDFS directories.
  • Dropped the table in the source, exported a snapshot from the destination cluster (which has data for both column families), and reran my batch job. The writes still go to one column family only; the other one is not getting any write requests.
  • Tried to tune HBase write performance by changing several parameters from the link below. No luck.

https://community.cloudera.com/t5/Community-Articles/Tuning-Hbase-for-optimized-performance-Part-2/t...

  • Not sure if it's a bug in my current HBase version. It had been working just fine for over a year. (HBase version: 2.1.0, CDH 6.2.1)

 

1 ACCEPTED SOLUTION

Super Collaborator

Hello @nanda_bigdata 


Sharing the solution so that the post can be marked as completed. Using the WAL reader, we confirmed that the writes in the RegionServer WAL pertained to one column family only, indicating that the writes were arriving for one column family only. It turned out that the application was using the wrong HBase configuration. Once the application used the correct HBase configuration, the issue was fixed.
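
For anyone who wants to check for the same kind of mismatch, here is a minimal sketch (not the exact check used in this case) that compares the hbase-site.xml an application picks up with the one from the cluster. It assumes the usual Hadoop-style layout of `<property>` elements with `<name>` and `<value>` children. The sample contents and host names are hypothetical; in practice you would read the two files from disk.

```python
import xml.etree.ElementTree as ET


def load_site_properties(xml_text):
    """Parse Hadoop-style configuration XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_properties(app_props, cluster_props):
    """Return {name: (app_value, cluster_value)} for every property that differs."""
    names = sorted(set(app_props) | set(cluster_props))
    return {
        name: (app_props.get(name), cluster_props.get(name))
        for name in names
        if app_props.get(name) != cluster_props.get(name)
    }


# Hypothetical sample contents standing in for the two hbase-site.xml files.
APP_SITE = """<configuration>
  <property><name>hbase.zookeeper.quorum</name><value>zk-old.example.com</value></property>
  <property><name>zookeeper.znode.parent</name><value>/hbase</value></property>
</configuration>"""

CLUSTER_SITE = """<configuration>
  <property><name>hbase.zookeeper.quorum</name><value>zk1.example.com</value></property>
  <property><name>zookeeper.znode.parent</name><value>/hbase</value></property>
</configuration>"""

differences = diff_properties(
    load_site_properties(APP_SITE), load_site_properties(CLUSTER_SITE)
)
for name, (app_value, cluster_value) in differences.items():
    print(f"{name}: application={app_value!r} cluster={cluster_value!r}")
```

Any property reported here is only a candidate to review; which setting was actually wrong in this case is not stated in the thread.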

 

- Smarak


3 REPLIES

Explorer

Hi Smarak,

 

Can you please elaborate on what changes were made on the application side to resolve the issue?

The memstore flush was failing for the column families of one of our tables, causing the region server to crash.

 

Thanks,

Vishal

Super Collaborator

Hello @vishal6196 

 

It's been a while since this post, but as far as I recall, the application was writing to one CF only. In short, each RegionServer uses a WAL, and the writes subsequently arrive at the MemStore, which is kept per CF at the region level. From WALReader, we confirmed that the WAL had entries for one CF only, which naturally means that only the MemStore of that CF would be populated. Additionally, I don't recall any crash being observed.
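
To illustrate that reasoning, here is a toy model (plain Python, not HBase code; the class and method names are made up for illustration). It only mimics the relationship described above: one shared log per RegionServer, and one MemStore per region and CF.

```python
from collections import defaultdict


class ToyRegionServer:
    """Illustrative stand-in only: a shared WAL plus per-(region, CF) memstores."""

    def __init__(self):
        self.wal = []  # one log shared by all regions on this server
        self.memstores = defaultdict(dict)  # (region, family) -> {(row, qualifier): value}

    def put(self, region, row, family, qualifier, value):
        # Each write is appended to the shared log first...
        self.wal.append((region, row, family, qualifier, value))
        # ...and then lands in the memstore of the CF it was addressed to.
        self.memstores[(region, family)][(row, qualifier)] = value

    def families_in_wal(self):
        """Set of column families that appear in the log."""
        return {entry[2] for entry in self.wal}


rs = ToyRegionServer()
# The application only ever addresses cf1, as in the case described here.
rs.put("region-1", "row1", "cf1", "q1", "a")
rs.put("region-1", "row2", "cf1", "q1", "b")

print(rs.families_in_wal())
print(("region-1", "cf2") in rs.memstores)
```

In this model, a CF that never shows up in the log can never have a populated memstore, so there is nothing to flush for it. That is the same inference we drew from the WALReader output.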

 

Are you facing similar concerns in your environment? If yes, kindly share the following details in a new post:

  • The MemStore flush failure trace from the logs,
  • Whether the problem is persistent,
  • Whether WALReader (Link [1]) shows writes happening on all CFs of the regions, and the count of CFs in the region.

- Smarak

 

[1] https://hbase.apache.org/book.html#hlog_tool.prettyprint
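
If it helps with the last check, here is a rough sketch for counting cells per region and CF from the WAL pretty-printer's JSON output mode. It assumes the output is a JSON array whose entries carry a "region" field and an "actions" list with a "family" field per cell. Please verify that shape against the output of your own HBase version. The sample data below is hypothetical.

```python
import json
from collections import Counter


def count_cells_per_family(wal_json_text):
    """Count WAL cells per (region, column family) in pretty-printer JSON text."""
    entries = json.loads(wal_json_text)
    counts = Counter()
    for entry in entries:
        region = entry.get("region", "unknown")
        for action in entry.get("actions", []):
            counts[(region, action.get("family", "unknown"))] += 1
    return counts


# Hypothetical sample in the assumed shape; replace with real pretty-printer output.
SAMPLE = """[
  {"table": "t1", "region": "abc123", "sequence": 10,
   "actions": [{"row": "r1", "family": "cf1", "qualifier": "q1"}]},
  {"table": "t1", "region": "abc123", "sequence": 11,
   "actions": [{"row": "r2", "family": "cf1", "qualifier": "q1"},
               {"row": "r2", "family": "cf1", "qualifier": "q2"}]}
]"""

counts = count_cells_per_family(SAMPLE)
for (region, family), n in sorted(counts.items()):
    print(f"region={region} family={family} cells={n}")
```

If a CF defined on the table never appears in these counts, the writes for it are most likely not reaching the RegionServer at all.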