I was using hive-testbench (https://github.com/hortonworks/hive-testbench) to generate TPC-H data sets. I started by generating a 10 GB dataset in Hive (./tpch-build.sh 10). Running select count(*) on the generated Hive table "part" returns a total of 2000000 rows. Meanwhile, I decided to download the official tpch_tool 2.17, generate the 10 GB .tbl files myself, and build a Hive database from them. For the same 10 GB data size, the same count query on the "part" table built from the .tbl files returns a total of 86586082 rows.
How is this possible? The number of rows should be the same. Can anyone give me an idea of what's going on?
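As a sanity check on the two counts, here is a minimal sketch of the row counts I would expect at a given scale factor. The base cardinalities and the lineitem figure for scale factor 10 are taken from my reading of the TPC-H specification, so treat them as assumptions; the names `BASE_ROWS`, `FIXED_ROWS`, `LINEITEM_ROWS_SF10` and `expected_rows` are made up for this example.

```python
# Rows at scale factor 1 for the tables that scale linearly
# (assumed from the TPC-H specification, not read from my data).
BASE_ROWS = {
    "supplier": 10_000,
    "part": 200_000,
    "partsupp": 800_000,
    "customer": 150_000,
    "orders": 1_500_000,
}

# Tables whose size does not depend on the scale factor (assumed).
FIXED_ROWS = {"nation": 25, "region": 5}

# lineitem is only approximately 6,000,000 * SF; this is the figure
# I believe the specification lists for scale factor 10 (assumption).
LINEITEM_ROWS_SF10 = 59_986_052


def expected_rows(scale_factor):
    """Return expected row counts per table, excluding lineitem."""
    rows = {name: base * scale_factor for name, base in BASE_ROWS.items()}
    rows.update(FIXED_ROWS)
    return rows


sf10 = expected_rows(10)
sf10["lineitem"] = LINEITEM_ROWS_SF10
total_sf10 = sum(sf10.values())

print("part:", sf10["part"])
print("all tables:", total_sf10)
```

Under these assumptions, "part" at scale factor 10 should have 2000000 rows, which matches the hive-testbench table. The sum over all eight tables comes out to 86586082, the same as the larger count, so it may be worth checking whether the location of my "part" table contains every generated .tbl file rather than only part.tbl.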