08-17-2016 07:44 AM
08-18-2016 12:00 PM
First, I want to be clear on the semantics of Kudu's writes. We aren't "eventually consistent" in the way that the term is usually used when describing systems such as Cassandra or Riak. Kudu uses a strict consensus algorithm (Raft) which 100% guarantees that all replicas will perform the same write operations in the exact same order, and therefore always have identical timelines and result in identical database states. Of course, one replica (of three) may be lagging behind in the timeline at any point, but divergent writes are not possible.
When you see consistency aberrations in Impala, it's due to the fact that the Impala integration is currently configured with a loose _read_ consistency. In other words, Impala scanners may read from replicas which are lagging, and thus give unexpected results. This improves data locality (and thus throughput), because Impala has more flexibility of which replica it can read from for a given tablet. But, the consistency may not be satisfactory for some applications.
Given that, we plan to improve this in the future by having systems like Impala verify that the replica they are reading from is up-to-date to a given snapshot timestamp before performing the read. There are a number of small issues that need to be ironed out with this, which we will address in upcoming releases.
You can read more detail about Kudu's consistency model and known aberrations here: http://kudu.apache.org/docs/transaction_semantics.html
Hope that helps
08-18-2016 04:46 PM
Thanks a lot for the reply Todd. Looking forward to the future releases, would definitely help in delegating the check for up-to date read to the system itself.