Created on 12-20-2017 07:44 AM - edited 09-16-2022 05:39 AM
Hi All,
The HBase shell count command run on a table gives a different count than our custom MapReduce row counter job run on a snapshot of the same table.
The custom row counter is similar to the default HBase MapReduce row counter. The only difference is that our job first creates a snapshot and runs the MapReduce task on the snapshot rather than directly on the table. After the count on the snapshot finishes, it deletes the snapshot programmatically.
The snapshot is created programmatically through the HBaseAdmin.snapshot() API.
When we analyzed the MapReduce job log, we found that it picks up extra regions while creating the snapshot.
According to the master web UI (and hbase hbck), tables TABLE1 and TABLE2 have 10 and 4 regions respectively; however, the snapshot creation logs show 18 and 7 regions respectively.
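To see exactly which regions are extra, rather than only comparing totals, the encoded region names from the two sources can be diffed. The following is a minimal Java sketch, assuming the names have been copied by hand from the master web UI (or hbck) and from the snapshot creation log. The class name RegionNameDiff and the sample names are hypothetical and are not part of our job.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RegionNameDiff {
    /**
     * Returns the region names that appear in the snapshot log
     * but are not reported as online by the master UI / hbck.
     */
    static Set<String> extraInSnapshot(List<String> online, List<String> inSnapshot) {
        Set<String> extra = new TreeSet<>(inSnapshot); // sorted for stable output
        extra.removeAll(online);
        return extra;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder names; replace with the real encoded region names.
        List<String> online = Arrays.asList("region-a", "region-b", "region-c", "region-d");
        List<String> inSnapshot = Arrays.asList(
                "region-a", "region-b", "region-c", "region-d",
                "region-e", "region-f", "region-g");
        System.out.println("Regions only in snapshot: " + extraInSnapshot(online, inSnapshot));
    }
}
```

The names this returns are the ones worth looking up in hbase:meta and on HDFS to find out why they are included in the snapshot.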
The same MapReduce job run on the same tables in prod gives a correct count that matches the HBase shell count command. We observe this erroneous behavior only in cob.
One-way replication is enabled from prod to cob.
Also, no filter condition is added to the scanner in the MapReduce job; it is a plain select count(*) kind of query.
The command used to run the custom MapReduce counter job is:
java -cp ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${HBASE_CONF_DIR} com.XXX.XXX.XXX.RowCounterCustom -libjars ${LIBJAR} TABLE1
We have observed that there are 7 .regioninfo files for table TABLE1, whereas the hbck command reports only 4 regions, and we suspect this might be the root cause of the issue. When the job creates the snapshot, it gathers information from 7 regions instead of 4, which might be why it counts extra rows.
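If the extra region directories cover key ranges that the online regions also cover, every row in the overlap would be read twice, which would inflate the count. Whether that is what happens here is not confirmed. Below is a minimal Java sketch for checking a list of [start, end) key ranges for overlaps. It assumes the start and end keys have been read out of the .regioninfo files by hand; the class RegionOverlapCheck is hypothetical, and plain String keys stand in for HBase's byte[] row keys (which compare as unsigned bytes), so this is a simplification.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RegionOverlapCheck {
    /** Key range [start, end). Empty start = beginning of table, empty end = end of table. */
    static final class Range {
        final String name;
        final String start;
        final String end;

        Range(String name, String start, String end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    /** Returns one message per pair of ranges that share at least one key. */
    static List<String> findOverlaps(List<Range> ranges) {
        List<Range> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparing((Range r) -> r.start));
        List<String> overlaps = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            for (int j = i + 1; j < sorted.size(); j++) {
                Range a = sorted.get(i);
                Range b = sorted.get(j);
                // After sorting, a.start <= b.start, so the pair overlaps
                // when b starts before a ends (or a runs to the end of the table).
                if (a.end.isEmpty() || b.start.compareTo(a.end) < 0) {
                    overlaps.add(a.name + " overlaps " + b.name);
                }
            }
        }
        return overlaps;
    }

    public static void main(String[] args) {
        // Hypothetical layout: one region covering the whole table plus two
        // regions that together cover the same key space.
        List<Range> ranges = new ArrayList<>();
        ranges.add(new Range("whole", "", ""));
        ranges.add(new Range("left", "", "m"));
        ranges.add(new Range("right", "m", ""));
        for (String msg : findOverlaps(ranges)) {
            System.out.println(msg);
        }
    }
}
```

An empty result would mean the extra directories do not overlap the online regions, in which case double counting from overlapping ranges would not explain the difference.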
This issue is not reproducible in lower environments.
We would like to know whether there is any known HBase bug in which the number of regions reported by the hbck command does not match the number of .regioninfo files for the table on HDFS.
Please let us know what could cause the job to process the wrong number of regions, leading to an incorrect row count in COB.
--------------------------------------------------------------------------
For reference, the description of TABLE1 is below:
hbase(main):001:0> desc 'TABLE1'
Table TABLE1 is ENABLED
TABLE1, {TABLE_ATTRIBUTES => {coprocessor$1 => '|org.apache.phoenix.coprocessor.ScanRegionObserver|805306366|', coprocessor$2 => '|org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver|805306366|', coprocessor$3 => '|org.apache.phoenix.coprocessor.GroupedAggregateRegionObserver|805306366|', coprocessor$4 => '|org.apache.phoenix.coprocessor.ServerCachingEndpointImpl|805306366|', coprocessor$5 => '|org.apache.phoenix.hbase.index.Indexer|805306366|index.builder=org.apache.phoenix.index.PhoenixIndexBuilder,org.apache.hadoop.hbase.index.codec.class=org.apache.phoenix.index.PhoenixIndexCodec', coprocessor$6 => '|org.apache.hadoop.hbase.regionserver.LocalIndexSplitter|805306366|'}
COLUMN FAMILIES DESCRIPTION
{NAME => '0', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '1', COMPRESSION => 'NONE', VERSIONS => '2', TTL => 'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row in 0.2790 seconds
--------------------------------------------------------------------------
Created 11-09-2021 04:49 AM
Hi sayhichand, did you find out the root cause of the issue? Can you please share your findings?
Created 11-09-2021 11:26 PM
@vishal6196, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
Regards,
Vidya Sargur,