
How to view the actual Bad Rows in Hbase Replication?


Hi,

I've set up HBase cross-cluster replication between two clusters. As a stress test I ran "sudo -su hbase hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1", which performs random inserts. The row count of the table matched on both clusters, i.e. 906856. However, we still have to verify that the replicated data is consistent on both clusters. To do that, I followed the Hortonworks documentation and ran the VerifyReplication job. The output is shown below:

ROWS_SCANNED=906856
RPC_CALLS=9070
RPC_RETRIES=0
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier$Counters
BADROWS=3
CONTENT_DIFFERENT_ROWS=3
GOODROWS=906853

The number of rows scanned, 906856, is correct and equals the total row count of the table. But there are 3 bad rows. Running the verification from the other cluster gave the same result. So the problem appears to be with the content of the replicated rows (quality), not with their number (quantity).

The main question now is:

How can I find and view the 3 actual bad rows in the table?

Regards,

Shesh Kumar


Re: How to view the actual Bad Rows in Hbase Replication?


How can I find and view the 3 actual bad rows in the table?

You can read the bad rows in the mapper logs of the VerifyReplication job, where each one is logged at ERROR level. Open the mapper logs, search the page for "ERROR", and voila! The matching entries show the bad rows.
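
If the logs are large, the search can be scripted. The following is only a minimal sketch: it assumes each bad row is logged on a line containing "ERROR" and ending in "rowkey=<key>", which may differ between HBase versions, so check a real log line first. The function name extract_bad_rows is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read mapper log text on stdin and print the row key
# of every bad row that was logged at ERROR level.
# Assumption: bad rows appear as "... ERROR ... <COUNTER>, rowkey=<key>".
extract_bad_rows() {
    # Keep only ERROR lines that carry a rowkey, then strip everything
    # up to and including "rowkey=" so only the key is printed.
    sed -n '/ERROR/s/.*rowkey=//p'
}
```

With log aggregation enabled, the job's logs could be piped in with `yarn logs -applicationId <application_id> | extract_bad_rows`, where `<application_id>` is the ID of your VerifyReplication job. Each key can then be fetched in the HBase shell on both clusters with `get '<table>', '<rowkey>'` to compare the cell contents.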