04-18-2017 07:03 PM
I am trying to understand how Kudu merges the base file with delta files. I found this in a blog:
"For each DiskRowSet, the scanner will materialise a column at a time, and apply any delta records and predicates before moving on to the next column."
Could someone point me to the code that does this?
Thanks in advance.
04-18-2017 08:30 PM
Most of the code that does this can be found in the "tablet" module.
Iterators for all rowsets are assembled in Tablet::CaptureConsistentIterators() (https://github.com/cloudera/kudu/blob/master/src/kudu/tablet/tablet.cc#L1572).
Then, for DiskRowSets, the base data iterator is wrapped in a delta iterator. This process starts at DiskRowSet::NewRowIterator() (https://github.com/cloudera/kudu/blob/master/src/kudu/tablet/diskrowset.cc#L592).
To get more insight into how the base data is materialized and how deltas are applied, you can follow the different iterators that are created in DiskRowSet::NewRowIterator().
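To make the column-at-a-time idea from the blog quote concrete, here is a rough standalone sketch. It is not Kudu's code: DeltaRecord, ColumnPredicate, BaseData, ScanResult and ScanRowSet are all hypothetical names invented for illustration, cells are plain int64_t values, and it leaves out everything around snapshots/timestamps, deletes, and on-disk encoding. It only shows the order of operations: materialize one column, apply its deltas, evaluate its predicates, then move on to the next column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified types for illustration; NOT Kudu's real classes.
struct DeltaRecord {
  std::size_t row_idx;  // row offset within the rowset
  std::size_t col_idx;  // column the update applies to
  int64_t new_value;    // updated cell value
};

struct ColumnPredicate {
  std::size_t col_idx;                   // column the predicate tests
  std::function<bool(int64_t)> matches;  // true if the cell passes
};

// Base data stored column-wise: base[col][row].
using BaseData = std::vector<std::vector<int64_t>>;

struct ScanResult {
  std::vector<std::vector<int64_t>> columns;  // columns after deltas
  std::vector<bool> selected;                 // rows passing all predicates
};

// Assumed flow: one column at a time, deltas first, then predicates.
ScanResult ScanRowSet(const BaseData& base,
                      const std::vector<DeltaRecord>& deltas,
                      const std::vector<ColumnPredicate>& predicates) {
  const std::size_t num_rows = base.empty() ? 0 : base[0].size();
  ScanResult result;
  result.selected.assign(num_rows, true);

  for (std::size_t col = 0; col < base.size(); ++col) {
    // 1. Materialize the base column (here: just copy it).
    std::vector<int64_t> column = base[col];

    // 2. Apply this column's deltas in order; later deltas win.
    for (const DeltaRecord& d : deltas) {
      if (d.col_idx != col) continue;
      assert(d.row_idx < column.size());
      column[d.row_idx] = d.new_value;
    }

    // 3. Evaluate this column's predicates against the updated values.
    for (const ColumnPredicate& p : predicates) {
      if (p.col_idx != col) continue;
      for (std::size_t row = 0; row < column.size(); ++row) {
        if (result.selected[row] && !p.matches(column[row])) {
          result.selected[row] = false;
        }
      }
    }

    result.columns.push_back(std::move(column));
  }
  return result;
}
```

For example, with base columns {1, 2, 3} and {10, 20, 30}, a delta setting row 1 of column 1 to 25, and a predicate "column 1 >= 25", the predicate sees the updated value, so row 1 is selected even though its base value was 20.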