I'm trying out Pivotal HAWQ with Ambari, and now I'm trying to run some queries over Hive tables with HAWQ.
From what I have seen, HAWQ can query Hive tables through HCatalog (https://community.hortonworks.com/articles/43264/hawqhdb-and-hadoop-with-hive-and-hbase.html ), so I use the psql tool on the command line to run queries like this:
SELECT * FROM hcatalog.hive-db-name.hive-table-name;
I had previously run the same queries in Hive to compare the results with HAWQ. I was expecting HAWQ to be much faster, but it is much slower: the query response time is much longer than in Hive. The specific query I am trying to run is query 1 from TPC-H, on a Hive table stored as ORC. Hive took 18 seconds; the same query run from psql through HCatalog took 6 minutes and 28 seconds.
Can someone explain why this is happening?
What version of HAWQ/HDB is it? What PXF profile are you using? You should try the new HiveORC profile in HDB 188.8.131.52, if you haven't:
Hi, HCatalog uses only one segment to process the data.
I think you can use a HAWQ managed table or a PXF external table to query your data instead.
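As a rough sketch of both options, something like the following could be run from psql. The host name `namenode`, the Hive database `default`, the table names `pxf_lineitem` and `lineitem_hawq`, and the column types are all assumptions for illustration; the column list follows the standard TPC-H `lineitem` layout and should be adjusted to match your actual Hive table. Port 51200 is the usual PXF default, but check your own installation.

```sql
-- Option 1: PXF external table over the Hive ORC table.
-- 'namenode', 'default.lineitem' and the column types are assumptions.
CREATE EXTERNAL TABLE pxf_lineitem (
    l_orderkey      int8,
    l_partkey       int8,
    l_suppkey       int8,
    l_linenumber    int4,
    l_quantity      numeric,
    l_extendedprice numeric,
    l_discount      numeric,
    l_tax           numeric,
    l_returnflag    text,
    l_linestatus    text,
    l_shipdate      date,
    l_commitdate    date,
    l_receiptdate   date,
    l_shipinstruct  text,
    l_shipmode      text,
    l_comment       text
)
LOCATION ('pxf://namenode:51200/default.lineitem?PROFILE=HiveORC')
FORMAT 'CUSTOM' (formatter='pxfwritable_import');

-- Option 2: copy the data once into a HAWQ managed table,
-- then run the TPC-H query against that table.
CREATE TABLE lineitem_hawq AS
SELECT * FROM pxf_lineitem
DISTRIBUTED RANDOMLY;
```

Running TPC-H query 1 against `pxf_lineitem` and then against `lineitem_hawq` should show how much of the time is spent reading through PXF and how much in the query itself.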