Performance metrics phoenix bulk load vs hbase bulk load?

Are there any known performance stats comparing Phoenix bulk load (MapReduce) with HBase bulk load?

1 ACCEPTED SOLUTION

Re: Performance metrics phoenix bulk load vs hbase bulk load?

@Sunile Manjee

I don't have stats, but you need to use Phoenix Bulk Load regardless, as HBase Bulk Load will not ensure consistent secondary indices, nor will it use the correct signing and byte ordering conventions that Phoenix needs.
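To illustrate the byte ordering point, here is a minimal sketch in Python rather than the actual Java code paths. It assumes the documented Phoenix encoding for a signed INTEGER (4 bytes, big-endian, sign bit flipped so that negative values sort before positive ones) and compares it with the plain big-endian two's complement bytes that HBase's Bytes.toBytes(int) writes. The function names are hypothetical and exist only for this example.

```python
import struct


def hbase_int_bytes(value: int) -> bytes:
    # Plain big-endian two's complement, like HBase's Bytes.toBytes(int).
    return struct.pack(">i", value)


def phoenix_int_bytes(value: int) -> bytes:
    # Assumption: Phoenix INTEGER is the same 4 bytes with the sign bit flipped,
    # so byte-wise (lexicographic) order matches numeric order.
    raw = struct.pack(">i", value)
    return bytes([raw[0] ^ 0x80]) + raw[1:]


values = [-5, -1, 0, 1, 7]

# HBase compares row keys and values as raw bytes, so sort by the encoded bytes.
hbase_sorted = sorted(values, key=hbase_int_bytes)
phoenix_sorted = sorted(values, key=phoenix_int_bytes)

print("raw two's complement order:", hbase_sorted)
print("sign-flipped order:        ", phoenix_sorted)
```

With the raw encoding the negative numbers sort after the positive ones, which is why data written by a plain HBase load would not read back correctly through a Phoenix signed type.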

3 REPLIES

Re: Performance metrics phoenix bulk load vs hbase bulk load?

@Sunile Manjee

I have never seen comparative stats for these two bulk load paths. If you have a Phoenix table, it would take some work to make a native HBase schema look enough like a Phoenix table for the comparison to mean anything. Things like complex keys or column types come to mind. If it is just a Phoenix view on an HBase table, the comparison might make more sense, but you lose a lot of the Phoenix magic.
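As a rough picture of what a "complex key" involves, the sketch below builds the row key bytes for a hypothetical table with PRIMARY KEY (host VARCHAR, event_id INTEGER). It assumes the documented Phoenix conventions with default ascending sort order: a variable-length VARCHAR in the key is terminated by a zero byte unless it is the last key column, and INTEGER is stored as 4 big-endian bytes with the sign bit flipped. The table, column names, and helper functions are made up for illustration.

```python
import struct


def encode_integer(value: int) -> bytes:
    # Assumption: Phoenix INTEGER = 4 big-endian bytes with the sign bit flipped.
    raw = struct.pack(">i", value)
    return bytes([raw[0] ^ 0x80]) + raw[1:]


def encode_row_key(host: str, event_id: int) -> bytes:
    # Hypothetical key layout for PRIMARY KEY (host VARCHAR, event_id INTEGER):
    # the VARCHAR is not the last key column, so a zero byte terminates it.
    return host.encode("utf-8") + b"\x00" + encode_integer(event_id)


key = encode_row_key("web01", 42)
print(key.hex())
```

A native HBase load would have to reproduce this layout byte for byte before the two tables could be compared fairly.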

Overall, the performance should not vary much from one to the other, aside from any extra work you hide in the Phoenix table, such as indexes and stats.

From a pure operations perspective, use the bulk load best suited to your type of table.

Re: Performance metrics phoenix bulk load vs hbase bulk load?

@nmaillard & @Randy Gelhausen great stuff, thank you.