
HBASE REST store Multiple rows




We are using Cell Store (Multiple), as defined in the HBase REST documentation, to POST multiple rows (as a JSON body) to HBase, and it's working well.
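For readers unfamiliar with this kind of request, here is a minimal sketch of how such a multi-row JSON body could be built. It assumes the JSON layout described in the HBase REST documentation (a `Row` list, each entry carrying a base64-encoded `key` and a `Cell` list with base64-encoded `column` and `$` value). The row keys, the `cf:name` column, and the helper names are hypothetical and not taken from the original post.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a string, as the REST JSON format expects for keys, columns and values."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_multi_row_body(rows: dict) -> str:
    """Build a JSON body for a multi-row store.

    rows maps a row key to a dict of "family:qualifier" -> value.
    """
    return json.dumps({
        "Row": [
            {
                "key": b64(row_key),
                "Cell": [
                    {"column": b64(column), "$": b64(value)}
                    for column, value in cells.items()
                ],
            }
            for row_key, cells in rows.items()
        ]
    })


# Hypothetical rows that share the same key prefix.
body = build_multi_row_body({
    "cust1-001": {"cf:name": "alice"},
    "cust1-002": {"cf:name": "bob"},
})
```

The resulting string would be sent as the request body with `Content-Type: application/json` to the table's row endpoint on the REST server; the host, port and table name depend on your setup.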


The behaviour is as follows: if all rows are inserted successfully, we get a 200 success response. If the insert of even one row fails, we get an error and none of the rows are inserted. In other words, it behaves like a multi-row transaction.


Elsewhere in the HBase documentation it is mentioned that HBase doesn't support transactions across multiple rows: if you put multiple rows, some might fail and some might succeed, and there is no guarantee that all will fail or all will succeed. But the behaviour above, through the REST API using Cell Store (Multiple), contradicts this statement.


Was there a patch to fix this? Am I missing something?




Re: HBASE REST store Multiple rows


I'm not intimately familiar with the HBase code, so take this thought for what it's worth.

I re-read the ACID statement because I agree: what I had always assumed was that you would get partial successes all the time.

If I interpret it correctly, the point they are making is that it "may" return which ones succeeded and which ones failed. But I do know there has been a steady effort to become more ACID compliant to make HBase friendlier, so it's not outside the realm of possibility that they try to avoid partial multi-mutates if at all possible.

If I had to guess: if your multi spanned only one regionserver, they could return an all-fail, but if your multi hit multiple regionservers, they could not, and you would get a partial success. The successes would come from the regionservers with no errors, and the failures would be all of the updates for the regionserver that had the one row fail.


Re: HBASE REST store Multiple rows

Appreciate your reply. I retested this today and, as you mentioned, if the
table spans region servers and the rowkey split is such that different
rowkeys in the batch insert go to different region servers, then some may
fail and some may succeed.

However, if all the rows in your batch go to the same region server (in our
case the rowkeys of all the rows share the same prefix, and the split is
based on the prefix), then all will succeed or all will fail.
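To check ahead of time whether a batch is likely to land in one place, the rowkeys can be grouped by the region they fall into. The following is a rough sketch only: it assumes you already know the table's split keys, it groups by region rather than by region server (one server can host several regions), and the function name and sample keys are hypothetical. It illustrates the behaviour observed above, not a guarantee made by HBase.

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_region(row_keys, split_keys):
    """Group row keys by the index of the region that would hold them.

    Region i covers keys from split_keys[i-1] (inclusive) up to
    split_keys[i] (exclusive); region 0 starts at the empty key.
    Keys are bytes, compared lexicographically as HBase does.
    """
    splits = sorted(split_keys)
    groups = defaultdict(list)
    for key in row_keys:
        groups[bisect_right(splits, key)].append(key)
    return dict(groups)


# Hypothetical split points and batch.
splits = [b"cust2", b"cust5"]
batch = [b"cust1-001", b"cust1-002", b"cust7-001"]
regions = group_by_region(batch, splits)
```

Here the two `cust1-` keys fall into the first region and `cust7-001` into the last one, so this batch would span more than one region and could see a partial success under the behaviour described in this thread.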