Created 10-03-2018 05:31 PM
I ran a count(*) on table XXX with "set hive.auto.convert.join=false;" and got 643198 rows, whereas the same count(*) on the same table with "set hive.auto.convert.join=true;" returns only 331815 rows. That is 311383 rows (about 48%) missing, and I have no clue why. Can anybody help? Let me know if more details are required.
Created 10-04-2018 06:16 AM
What is your execution engine?
Can you run the analyze statistics command and then check the counts again?
analyze table <table name> compute statistics;
https://community.hortonworks.com/content/supportkb/49639/hiveautoconvertjoin-true.html
Created 10-04-2018 12:24 PM
We are using the Tez engine.
Even after running ANALYZE, rows are still getting dropped.
ANALYZE TABLE XXX PARTITION(snapshot_id='TEST9') COMPUTE STATISTICS;
ANALYZE TABLE XXX PARTITION(snapshot_id='TEST9') COMPUTE STATISTICS for columns;
SET hive.auto.convert.join=false;
select count(*) from XXX where snapshot_id='TEST9';
Count: 643198
SET hive.auto.convert.join=true;
select count(*) from XXX where snapshot_id='TEST9';
Count: 331815
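One way to narrow this down is to compare the query plans under both settings. This is only a sketch: it reuses the table name XXX and partition value from the queries above, and it assumes (not confirmed anywhere in this thread) that XXX may be a view over a join, or that count(*) may be answered from metastore statistics rather than a scan. Either could explain why a join setting affects a single-table count.

```sql
-- Sketch only: XXX and snapshot_id='TEST9' come from the queries above.

-- Force count(*) to scan the data instead of being answered from
-- (possibly stale) metastore statistics.
SET hive.compute.query.using.stats=false;

-- Plan with map-join conversion disabled.
SET hive.auto.convert.join=false;
EXPLAIN SELECT count(*) FROM XXX WHERE snapshot_id='TEST9';

-- Plan with map-join conversion enabled; look for a Map Join Operator
-- that does not appear in the plan above.
SET hive.auto.convert.join=true;
EXPLAIN SELECT count(*) FROM XXX WHERE snapshot_id='TEST9';

-- Assumption: if XXX is a view, the "Table Type" and "View Expanded Text"
-- fields here will show the underlying join.
DESCRIBE FORMATTED XXX;
```

If the two plans differ only in the join operator, the difference in counts most likely comes from the map-join path, and the EXPLAIN output would be the detail worth posting here.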