04-18-2017 07:03 PM
I am trying to understand how Kudu merges the base file with delta files. I found this in a blog:
For each DiskRowSet, the scanner will materialise a column at a time, and apply any delta records and predicates before moving on to the next column.
Could someone point me to the code that does this?
Thanks in advance.
04-18-2017 08:30 PM
Most of the code that does this can be found in the "tablet" module.
Iterators for all rowsets are assembled in Tablet::CaptureConsistentIterators() (https://github.com/cloudera/kudu/blob/master/src/kudu/tablet/tablet.cc#L1572)
Then, for each DiskRowSet, the base data iterator is wrapped in a delta iterator. This process starts at DiskRowSet::NewRowIterator() (https://github.com/cloudera/kudu/blob/master/src/kudu/tablet/diskrowset.cc#L592)
To get more insight into how the base data is materialized and how deltas are applied, you can follow the different iterators that are created in DiskRowSet::NewRowIterator().
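To make the idea concrete, here is a rough, self-contained sketch of the pattern described above: a base iterator wrapped by a delta-applying iterator, with the scan materializing one column at a time and evaluating predicates before moving on to the next column. This is not Kudu's actual code. All names here (BaseIterator, DeltaApplyingIterator, Delta, Scan) are hypothetical, columns are assumed to hold int64 values, and deltas are assumed to be simple "set this cell to this value" updates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified model. None of these types are Kudu's real classes.
using Column = std::vector<int64_t>;
using Predicate = std::function<bool(int64_t)>;

// One update delta: set column `col` of row `row` to `value`.
struct Delta {
  std::size_t row;
  std::size_t col;
  int64_t value;
};

// Stand-in for the base data iterator: returns the stored column values.
class BaseIterator {
 public:
  explicit BaseIterator(std::vector<Column> columns)
      : columns_(std::move(columns)) {}

  std::size_t num_rows() const {
    return columns_.empty() ? 0 : columns_[0].size();
  }

  Column MaterializeColumn(std::size_t col) const { return columns_.at(col); }

 private:
  std::vector<Column> columns_;
};

// Stand-in for the delta iterator that wraps the base iterator.
class DeltaApplyingIterator {
 public:
  DeltaApplyingIterator(BaseIterator base, std::vector<Delta> deltas)
      : base_(std::move(base)), deltas_(std::move(deltas)) {}

  std::size_t num_rows() const { return base_.num_rows(); }

  // Materialize the base column, then overwrite any cells that have deltas.
  // Deltas are applied in order, so a later delta for the same cell wins.
  Column MaterializeColumn(std::size_t col) const {
    Column out = base_.MaterializeColumn(col);
    for (const Delta& d : deltas_) {
      if (d.col == col) {
        assert(d.row < out.size());
        out[d.row] = d.value;
      }
    }
    return out;
  }

 private:
  BaseIterator base_;
  std::vector<Delta> deltas_;
};

// Column-at-a-time scan: for each column, materialize it (deltas included),
// apply that column's predicate to a shared selection vector, then move on.
// Returns the surviving rows in row-major form.
std::vector<std::vector<int64_t>> Scan(
    const DeltaApplyingIterator& iter, std::size_t num_cols,
    const std::map<std::size_t, Predicate>& predicates) {
  const std::size_t num_rows = iter.num_rows();
  std::vector<bool> selected(num_rows, true);
  std::vector<Column> columns;
  for (std::size_t c = 0; c < num_cols; ++c) {
    Column col = iter.MaterializeColumn(c);
    auto it = predicates.find(c);
    if (it != predicates.end()) {
      for (std::size_t r = 0; r < num_rows; ++r) {
        if (selected[r] && !it->second(col[r])) selected[r] = false;
      }
    }
    columns.push_back(std::move(col));
  }
  std::vector<std::vector<int64_t>> rows;
  for (std::size_t r = 0; r < num_rows; ++r) {
    if (!selected[r]) continue;
    std::vector<int64_t> row;
    for (const Column& col : columns) row.push_back(col[r]);
    rows.push_back(std::move(row));
  }
  return rows;
}
```

In this sketch the predicate sees the post-delta value, which is why deltas have to be applied before the predicate is evaluated. The real iterators in the tablet module deal with much more (MVCC snapshots, deletes, multiple delta stores, block-wise batching), so treat this only as a mental model when reading DiskRowSet::NewRowIterator().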