11-12-2014
06:45 AM
I agree with Clint: bulk loading into HBase every 3 minutes is too frequent and will cause a ton of compactions. To remedy the splits, you should have an overall understanding of what your data will look like 6 months to 1 year from now and pre-split the table when you create it. That should give you enough regions to load all of your data without having to split every time. This is a best practice for puts as well. Also, with regard to bulk loading, early versions of CDH4 had some issues with sequence numbers, so I would advise moving to CDH 5.1.3.
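To make the pre-split idea a bit more concrete, here is a rough sketch of how you might work out a region count and split points up front. It rests on assumptions of mine that may not match your setup: row keys are fixed-width hex strings spread evenly across the key space (for example hashed or salted keys), and the table name, column family, and sizes are made-up placeholders. The helper functions are hypothetical, not part of any HBase API; only the final string uses the HBase shell's `create ... SPLITS => [...]` syntax.

```python
import math


def estimate_region_count(projected_bytes, target_region_bytes):
    """Regions needed to hold the projected data without further splits."""
    return max(1, math.ceil(projected_bytes / target_region_bytes))


def hex_split_points(num_regions, key_width=8):
    """Evenly spaced split keys over a fixed-width hex key space.

    N regions need N - 1 split points. Assumes keys are uniformly
    distributed; skewed keys would need split points taken from a sample.
    """
    key_space = 16 ** key_width
    return [
        format(i * key_space // num_regions, "0{}x".format(key_width))
        for i in range(1, num_regions)
    ]


def create_table_command(table, family, splits):
    """Format an HBase shell 'create' statement with explicit split keys."""
    quoted = ", ".join("'{}'".format(s) for s in splits)
    return "create '{}', '{}', SPLITS => [{}]".format(table, family, quoted)


if __name__ == "__main__":
    # Hypothetical sizing: 2 TiB expected in a year, 10 GiB per region.
    regions = estimate_region_count(2 * 2**40, 10 * 2**30)
    splits = hex_split_points(regions)
    print(create_table_command("my_table", "cf", splits))
```

If your keys are not evenly distributed, I would take the split points from a sample of real row keys instead of spacing them evenly; the sizing step stays the same.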