
HBase Active-Active writes collision


With active-active HBase clusters, where writes to the same table may be performed on both clusters, how does HBase reconcile writes that carry the same timestamp?


To the best of my understanding, this falls into the category of "last write wins". If you treat replication as just another client (the source HBase cluster really is acting as a client to the destination HBase cluster), then whichever writer applies its update with that timestamp last has its value persisted.
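As a rough illustration of that "last write wins" reading, here is a toy model in Python. It is not HBase code: `apply_in_order` is a made-up helper, and the assumption that an identical timestamp simply overwrites the earlier value is taken from the description above, not from HBase's storage internals.

```python
def apply_in_order(writes):
    """Apply (timestamp, value) writes to a single cell in arrival order."""
    versions = {}
    for timestamp, value in writes:
        # Assumption: a later write with an identical timestamp
        # replaces the value stored under that timestamp.
        versions[timestamp] = value
    return versions


# The local client write arrives first, the replicated edit second.
local_then_replicated = apply_in_order(
    [(1000, "from-client"), (1000, "from-replication")]
)

# Same two writes, opposite arrival order.
replicated_then_local = apply_in_order(
    [(1000, "from-replication"), (1000, "from-client")]
)
```

In this model the surviving value depends only on which write was applied last, whether it came from an ordinary client or from replication.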


HBase uses the timestamp field to keep track of versions of cells. Usually, if a write does not specify a timestamp explicitly from the client side, then HBase assigns the current time in milliseconds since epoch as the write timestamp and thus version. Reads usually ask only for the latest version of the cell.
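To make the versioning idea concrete, here is a minimal sketch in Python. It only models the behaviour described above; `VersionedCell` and its methods are hypothetical names, not part of the HBase client API.

```python
import time


class VersionedCell:
    """Toy model of one cell whose versions are keyed by timestamp."""

    def __init__(self):
        self._versions = {}  # timestamp in ms -> value

    def put(self, value, timestamp=None):
        # With no explicit timestamp, fall back to the current time
        # in milliseconds since the epoch, as described above.
        if timestamp is None:
            timestamp = int(time.time() * 1000)
        self._versions[timestamp] = value
        return timestamp

    def get(self, max_versions=1):
        # Return the newest versions first; the default asks only
        # for the latest one.
        newest_first = sorted(self._versions.items(), reverse=True)
        return newest_first[:max_versions]


cell = VersionedCell()
cell.put("old", timestamp=1000)
cell.put("new", timestamp=2000)
assigned = cell.put("now")  # timestamp assigned from the wall clock

latest = cell.get()
all_versions = cell.get(max_versions=3)
```

Here `latest` holds only the wall-clock write, while `all_versions` still contains the two older versions behind it.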

Intercluster replication sends the edits to the other cluster with the timestamp / version field unchanged. If two clusters accept a write to the same cell, each assigns its own timestamp according to the local wall clock of the server that holds the region. Each cluster then replicates its own write to the other cluster, so each cluster ends up with both writes, stored as two different versions of the same cell. A read that comes in to either cluster and asks for the latest version of the cell returns the value with the higher timestamp. It is important to note that once both clusters have successfully replicated the data to each other, both clusters will return the same result.
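The convergence described above can be sketched with a small simulation. This is a toy model under stated assumptions, not HBase code: the `Cluster` class, the cell key `row1/cf:q`, and the clock values are all invented for illustration, and the sketch simply never re-queues replicated edits.

```python
class Cluster:
    """Toy model of one cluster in an active-active pair."""

    def __init__(self, name):
        self.name = name
        self.cells = {}     # cell key -> {timestamp: value}
        self.outbound = []  # local edits waiting to be shipped to the peer

    def _apply(self, key, timestamp, value):
        self.cells.setdefault(key, {})[timestamp] = value

    def client_put(self, key, value, local_clock_ms):
        # The timestamp comes from this cluster's own wall clock.
        self._apply(key, local_clock_ms, value)
        self.outbound.append((key, local_clock_ms, value))

    def replicate_to(self, peer):
        # Edits are shipped with their original timestamps unchanged.
        for key, timestamp, value in self.outbound:
            peer._apply(key, timestamp, value)
        self.outbound.clear()

    def get_latest(self, key):
        versions = self.cells[key]
        return versions[max(versions)]


key = "row1/cf:q"
a, b = Cluster("A"), Cluster("B")
a.client_put(key, "written-on-A", local_clock_ms=1000)
b.client_put(key, "written-on-B", local_clock_ms=1005)

# Before replication each cluster only sees its own write.
before = (a.get_latest(key), b.get_latest(key))

a.replicate_to(b)
b.replicate_to(a)

# After replication both hold both versions and agree on the latest.
after = (a.get_latest(key), b.get_latest(key))
```

After both replication steps, each cluster stores two versions of the cell and returns the value with the higher timestamp, here the one written on B.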