HBase count mismatch with Hive count

I am loading data from an HBase table using Spark.

I load the data into a Spark DataFrame, write it out as Parquet files, and then create a Hive table pointing to those Parquet files.

Now, when I check the count in Hive, it differs from the count I originally saw in the HBase shell for that table.

In fact, the Spark DataFrame itself seems to contain a different count.

The difference also follows no pattern I can understand: sometimes the count is larger in Hive and sometimes in HBase, so the behaviour is not even consistent.
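
Since the totals alone do not show where the rows diverge, one way to narrow this down might be to compare the row keys themselves rather than just the counts. Below is a minimal sketch in plain Python (not Spark), assuming the row keys from the HBase table and from the DataFrame have already been exported into two lists. The function name `compare_row_keys` and the sample keys are hypothetical and only illustrate the idea.

```python
from collections import Counter


def compare_row_keys(hbase_keys, dataframe_keys):
    """Report how two collections of row keys differ (hypothetical helper)."""
    hbase_counts = Counter(hbase_keys)
    df_counts = Counter(dataframe_keys)
    return {
        # Total number of keys seen on each side, duplicates included
        "hbase_total": sum(hbase_counts.values()),
        "dataframe_total": sum(df_counts.values()),
        # Keys present on one side but missing on the other
        "only_in_hbase": sorted(set(hbase_counts) - set(df_counts)),
        "only_in_dataframe": sorted(set(df_counts) - set(hbase_counts)),
        # Keys that appear more than once in the DataFrame
        "duplicated_in_dataframe": sorted(
            key for key, n in df_counts.items() if n > 1
        ),
    }


# Made-up sample keys, purely for illustration
hbase_keys = ["row1", "row2", "row3", "row4"]
dataframe_keys = ["row1", "row2", "row2", "row5"]

report = compare_row_keys(hbase_keys, dataframe_keys)
for name, value in report.items():
    print(name, value)
```

In this made-up sample both sides have the same total, yet the keys differ, which shows why comparing keys can reveal more than comparing counts. Keys missing on one side, or duplicated in the DataFrame, would each point to a different stage of the pipeline to inspect.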

I am unable to find the cause of this behaviour. Can somebody help, please?
