Is SparkSQL faster than Hive or Beeline?

Expert Contributor

My Hive queries are taking a long time to run. We have around 2 million records that I query, and the results come back only after a long wait. I was looking for an alternative, and Spark came to mind first. I went through some Hortonworks links that illustrate how to query a Hive table using SparkSQL (SSQL), but they were quite generic. Here is my requirement.

I have Hive tables already created and I need to query them using SSQL. What is the best way to do that?

I would also like to create new Hive tables using SSQL. Would such a table be the same as a regular Hive table or different? If different, in what ways? Would I still be able to query the tables created by SSQL using Hive or Beeline?

1 ACCEPTED SOLUTION

Super Guru

These slides compare Spark SQL, Hive on Tez, and Hive on Spark:

http://www.slideshare.net/hortonworks/hive-on-spark-is-blazing-fast-or-is-it-final

Hive on Spark (HIVE-7292) is still in beta. With earlier versions of Spark, you queried Hive tables through the hiveContext object, but starting with Spark 1.4 you can query Hive tables using the sqlContext object.
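As a rough sketch of what querying an existing Hive table through sqlContext could look like from PySpark: the helper below only relies on the context object having a `sql()` method, which SQLContext and HiveContext both provide. The function name `query_hive_table` and the table `default.sample_table` are made up for illustration, and the sketch assumes a Spark build with Hive support and a reachable Hive metastore.

```python
import re

# Hypothetical helper, not part of Spark: builds a simple SELECT and
# hands it to whatever context object is passed in.
_TABLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def query_hive_table(sql_context, table_name, limit=10):
    """Return the first `limit` rows of an existing Hive table.

    `sql_context` is assumed to be the sqlContext object available in
    the pyspark shell (or any object exposing a compatible sql() method).
    """
    # Accept only plain `table` or `db.table` names, since the name is
    # interpolated directly into the SQL text.
    if not _TABLE_NAME.match(table_name):
        raise ValueError("unexpected table name: %r" % (table_name,))
    limit = int(limit)
    if limit <= 0:
        raise ValueError("limit must be positive")
    query = "SELECT * FROM {0} LIMIT {1}".format(table_name, limit)
    return sql_context.sql(query)
```

In the pyspark shell, where `sqlContext` is already defined, a call might look like `query_hive_table(sqlContext, "default.sample_table").show()`. Whether this ends up faster than running the same query in Hive depends on the data and cluster, so it is worth timing both on a representative query.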
