Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Does Apache Phoenix or Drill support Binary Avro file format?

avatar
Expert Contributor

We have HBase tables where the data is in in Binary Avro format. To query the HBase tables easily, everytime we are creating Hive Tables and then query it, which is a tedious process as the tables are taking a long time for creation and also AdHoc tasks goes for a toss. As Phoenix or Drill can be a best alternative to Hive, a question arouse in me, whether they will support the Avro file format.

Will Phoenix or Drill make it in my case?

1 ACCEPTED SOLUTION

avatar
Super Guru
@Alex Raj

I am a little confused by the following statement:

"We have HBase tables where the data is in in Binary Avro format."

HBase stores data in HFiles and it's HBase's own format and not Avro. May be what you mean is you are exporting data from HBase into Avro and using Hive to read that data. If this is true, you can continue to do that as there are some advantages to this approach but if you want to keep data in HBase without moving it, then you can simply use Phoenix on top of HBase to read that data without moving it. In fact you can use Hive to read data in HBase. It's slow compared to Phoenix but it will do the job. May be that's what you are doing right now.

On the other hand, if you want to use Phoenix on top of HBase, you can read HBase tables from Phoenix using SQL. Again, you don't have to export data. Here is a link to quick start Phoenix.

The point is Avro doesn't come into play here and it's a little confusing why you are asking for Avro format. Between Phoenix and Drill, I would recommend using Phoenix because it's solely created for HBase and has better features and support compared to Drill.

View solution in original post

1 REPLY 1

avatar
Super Guru
@Alex Raj

I am a little confused by the following statement:

"We have HBase tables where the data is in in Binary Avro format."

HBase stores data in HFiles and it's HBase's own format and not Avro. May be what you mean is you are exporting data from HBase into Avro and using Hive to read that data. If this is true, you can continue to do that as there are some advantages to this approach but if you want to keep data in HBase without moving it, then you can simply use Phoenix on top of HBase to read that data without moving it. In fact you can use Hive to read data in HBase. It's slow compared to Phoenix but it will do the job. May be that's what you are doing right now.

On the other hand, if you want to use Phoenix on top of HBase, you can read HBase tables from Phoenix using SQL. Again, you don't have to export data. Here is a link to quick start Phoenix.

The point is Avro doesn't come into play here and it's a little confusing why you are asking for Avro format. Between Phoenix and Drill, I would recommend using Phoenix because it's solely created for HBase and has better features and support compared to Drill.