I am loading data from an HBase table using Spark.
I load the data into a Spark DataFrame, write it out as Parquet files, and then create a Hive table pointing to those Parquet files.
Now, when I check the row count in Hive, it differs from the count I originally saw in the HBase shell for that table.
In fact, the Spark DataFrame itself seems to contain a different count.
Also, the difference follows no pattern I can understand: sometimes the count is larger in Hive and sometimes in HBase, so the behaviour is not even consistent.
I am unable to find the cause of this behaviour. Can somebody help, please?