Hive and HBase: Which is best in terms of storage?

Expert Contributor

Both store their data on HDFS, so which is better in terms of storage, memory, etc.?

1 ACCEPTED SOLUTION

Super Guru

Apache Hive and Apache HBase are fundamentally different systems with completely different architectures. As such, which one is more efficient really depends on the application's use cases. It's impossible to state generically that Hive or HBase is better or worse than the other, and the fact that they both use HDFS for storing data is irrelevant.

Please quantify your application's requirements if you'd like an answer about whether Hive or HBase is the better fit for you.


2 REPLIES

Master Guru

I completely agree with Josh. Hive is best suited for EDW-type querying. HBase is a key-value store, so you need to know your questions before designing the PDM. Both use HDFS as the underlying storage. If you know which queries will be run and have a defined access path model, Phoenix/HBase will give you the lowest latency. If you are looking for general BI queries and can't define the access path up front, Hive is the way to go.
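
To make the access-path point concrete, here is a minimal Python sketch. It is a toy in-memory model, not the HBase, Phoenix, or Hive API, and the row-key layout (`customer_id#order_date`), the sample rows, and the function names are all assumptions made up for illustration. It contrasts a lookup that follows a key designed for a known question with an ad-hoc aggregation that has to read every row.

```python
from bisect import bisect_left

# Toy rows kept sorted by row key, loosely mimicking how a key-value store
# orders data. The key layout "customer_id#order_date" is a hypothetical
# design chosen because the known question is "orders for one customer".
ROWS = sorted(
    [
        ("c001#2016-01-03", {"amount": 40, "region": "EU"}),
        ("c001#2016-02-11", {"amount": 15, "region": "EU"}),
        ("c002#2016-01-20", {"amount": 70, "region": "US"}),
        ("c003#2016-03-05", {"amount": 25, "region": "US"}),
    ],
    key=lambda row: row[0],
)
KEYS = [key for key, _ in ROWS]


def prefix_scan(prefix):
    """Known access path: binary-search to the first matching key,
    then read only the rows that share the prefix."""
    start = bisect_left(KEYS, prefix)
    matches = []
    for key, record in ROWS[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later row can match
        matches.append((key, record))
    return matches


def total_by_region():
    """Ad-hoc analytic question: the row key does not help,
    so every row has to be read."""
    totals = {}
    for _, record in ROWS:
        region = record["region"]
        totals[region] = totals.get(region, 0) + record["amount"]
    return totals


if __name__ == "__main__":
    print(prefix_scan("c001#"))
    print(total_by_region())
```

The first function only stays cheap because the question was known when the key was designed. The second is the kind of query that, as described above, is usually a better fit for Hive, since it scans everything regardless of key layout.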