12-06-2018
03:05 AM
I am loading data from an HBase table using Spark. I load the data into a Spark DataFrame, write it out as Parquet files, and then create a Hive table pointing to those Parquet files. When I check the row count in Hive, it differs from the count I originally saw in the HBase shell for that table. In fact, the Spark DataFrame itself seems to contain a different count. The difference also follows no pattern I can understand: sometimes the count is larger in Hive and sometimes in HBase, so the behaviour is not consistent. I am unable to find the cause. Can somebody help, please?