Which is better for performance: Hive 3 or HBase?

New Contributor

I want to save events from Kafka into the Hadoop ecosystem and also retrieve them with low latency. I can use Hive 3 or HBase. Which one should I use?

1 ACCEPTED SOLUTION

Guru
@Vinit_Kratin,

If low response time is a must, then Hive is not an option; Impala should be faster.

HBase tables are designed around a row key, which means that any data you want to retrieve has to be looked up by that key. So for your event-based data, if you can use an event ID to store and retrieve the data, then HBase might be suitable.
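To make the row key point concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the HBase client API: the class `RowKeyStore`, its methods, and the `evt-0001` style event IDs are all made up for illustration. It shows why lookups by key (or key prefix) are cheap in a store sorted by row key, while a query on any other field has to examine every row.

```python
import bisect


class RowKeyStore:
    """Toy model of a table kept sorted by row key (hypothetical, not the HBase API)."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> dict of column values

    def put(self, row_key, columns):
        # Insert the key in sorted position the first time it is seen
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
        self._rows[row_key] = dict(columns)

    def get(self, row_key):
        # Point lookup by row key: no other rows are touched
        return self._rows.get(row_key)

    def scan_prefix(self, prefix):
        # Range scan: jump to the first key >= prefix, read while keys still match
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            result.append((key, self._rows[key]))
        return result

    def filter_by_column(self, name, value):
        # No index on column values: every row has to be examined
        return [(k, self._rows[k]) for k in self._keys
                if self._rows[k].get(name) == value]


store = RowKeyStore()
store.put("evt-0001", {"type": "login", "user": "alice"})
store.put("evt-0002", {"type": "purchase", "user": "bob"})
store.put("evt-0003", {"type": "login", "user": "bob"})

by_id = store.get("evt-0002")                    # fast: direct key lookup
by_prefix = store.scan_prefix("evt-000")         # fast: contiguous key range
logins = store.filter_by_column("type", "login") # slow: full pass over all rows
print(by_id)
print(len(by_prefix), len(logins))
```

If most of your queries look like `filter_by_column` rather than `get`, that is a hint that a SQL engine is probably a better fit than a key-based store.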

I still suggest that you read up on the differences between HBase, Hive, and Impala to get a better understanding. Google is your friend.

Cheers
Eric


4 REPLIES

Guru
@Vinit_Kratin,

You can't compare Hive with HBase directly; it really depends on the use case: how you want to store your data, how you will query it, whether you need to delete data, and so on.

You will need to explain a bit more about what you want to achieve before anyone can say which is better, Hive or HBase.

A quick Google search returned these useful links:
https://community.cloudera.com/t5/Support-Questions/When-to-use-Hive-and-Hbase/m-p/206167
https://community.cloudera.com/t5/Community-Articles/The-Differences-between-Pig-Hive-and-HBase/ta-p...

https://www.dezyre.com/article/hive-vs-hbase-different-technologies-that-work-better-together/322

Cheers
Eric

New Contributor
Thanks a lot for your reply.
What I want to do is store events in the database.
These events are structured and need to be queried multiple times.
Deletion is not required. Low response time is a must.
Please let me know if you need any more information.
