
What's the status of Phoenix? Should I use Phoenix instead of creating Hive table on top of HBase data?

Can someone list pros and cons of Phoenix?

2 REPLIES

Re: What's the status of Phoenix? Should I use Phoenix instead of creating Hive table on top of HBase data?

There is an HBaseStorageHandler for Hive that lets you access HBase tables from Hive; however, tests I have conducted showed it performing very poorly when running aggregations on a large table with hundreds of millions of records.

Phoenix, on the other hand, aggregates the same data from HBase very quickly. I typically build a Phoenix view atop the HBase table.
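As a rough sketch of what building such a view involves, the snippet below just assembles the Phoenix `CREATE VIEW` statement for an existing HBase table. The table name `events`, the column family `d`, the column names, and the helper `phoenix_view_ddl` are all made up for illustration, and the column types are assumptions that have to match how the bytes were actually written to HBase. Names are double-quoted because Phoenix otherwise upper-cases identifiers, while HBase table, family, and qualifier names are case sensitive.

```python
def phoenix_view_ddl(hbase_table, row_key, columns, row_key_type="VARCHAR"):
    """Build a Phoenix CREATE VIEW statement over an existing HBase table.

    hbase_table: name of the HBase table (hypothetical example name below).
    row_key: name to give the HBase row key column in Phoenix.
    columns: list of (column_family, qualifier, phoenix_type) tuples.
    row_key_type: Phoenix type of the row key (assumed VARCHAR here).
    """
    # The row key becomes the primary key of the view.
    parts = ['"{}" {} PRIMARY KEY'.format(row_key, row_key_type)]
    # Each HBase cell is addressed as "family"."qualifier".
    for family, qualifier, sql_type in columns:
        parts.append('"{}"."{}" {}'.format(family, qualifier, sql_type))
    column_list = ", ".join(parts)
    return 'CREATE VIEW "{}" ({})'.format(hbase_table, column_list)


# Hypothetical table layout, for illustration only.
ddl = phoenix_view_ddl(
    "events",
    "pk",
    [("d", "user_id", "VARCHAR"), ("d", "amount", "UNSIGNED_LONG")],
)
print(ddl)

# An aggregation that could then be run against the view.
query = 'SELECT "user_id", SUM("amount") FROM "events" GROUP BY "user_id"'
print(query)
```

The generated statement would then be run through a Phoenix client such as sqlline or the JDBC driver. A view leaves the underlying HBase table and its data untouched, which is why I prefer it over creating a Phoenix table for data that already exists.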

There is also a Phoenix Storage Handler for Hive, but I have never tested this and cannot comment on its performance. 


Phoenix was built to provide a SQL interface to HBase, whereas Hive's origins are in creating a schema for data in HDFS. Phoenix is an active project with recent commit activity (https://github.com/apache/phoenix).

Re: What's the status of Phoenix? Should I use Phoenix instead of creating Hive table on top of HBase data?

To add to what binu has said: Phoenix is in general faster than native HBase calls because of the way Phoenix compacts the bytes it stores in HBase, which gives you better performance. You get a ton of benefits with Phoenix over plain HBase, and IMHO it should in general be your default way to access HBase.