Merging tablet base and delta files in Kudu

New Contributor

Hello,

 

I am trying to understand how Kudu merges the base file with delta files.  I found this in a blog:

 

For each DiskRowSet, the scanner will materialise a column at a time, and apply any delta records and predicates before moving on to the next column.

 

Could someone point me to the code that does this?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Cloudera Employee

Hi @rpaidar

 

  Most of the code that does this can be found in the "tablet" module.

  Iterators for all rowsets are assembled in Tablet::CaptureConsistentIterators() (https://github.com/cloudera/kudu/blob/master/src/kudu/tablet/tablet.cc#L1572)

  Then, for DiskRowSets, the base data iterator is wrapped in a delta iterator. This process starts at DiskRowSet::NewRowIterator() (https://github.com/cloudera/kudu/blob/master/src/kudu/tablet/diskrowset.cc#L592)

  To get more insight into how the base data is materialized and how the deltas are applied, you can follow the different iterators that are created in DiskRowSet::NewRowIterator().
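  To make the wrapping idea concrete, here is a toy C++ sketch of a base iterator wrapped by a delta-applying iterator, with a scan that works one column at a time. It is not Kudu's code: BaseData, Delta, BaseIterator, DeltaApplyingIterator, Predicate and SelectRows are all hypothetical names made up for illustration. It also assumes deltas are simple "set this cell to this value" updates kept in timestamp order, and that predicates are simple lower bounds. The real delta stores, snapshots and iterators are considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Kudu classes.
using ColumnBlock = std::vector<int64_t>;

// Immutable base data, stored column by column: columns[col][row].
struct BaseData {
  std::vector<ColumnBlock> columns;
  size_t num_rows() const { return columns.empty() ? 0 : columns[0].size(); }
};

// One update to a single cell, stamped with the time it was made.
// Assumption: the delta list is kept in ascending timestamp order.
struct Delta {
  size_t row;
  size_t col;
  int64_t timestamp;
  int64_t new_value;
};

// Reads columns straight from the base data, ignoring any updates.
class BaseIterator {
 public:
  explicit BaseIterator(const BaseData* base) : base_(base) {}
  size_t num_rows() const { return base_->num_rows(); }
  ColumnBlock MaterializeColumn(size_t col) const {
    assert(col < base_->columns.size());
    return base_->columns[col];  // copy, so the base stays untouched
  }

 private:
  const BaseData* base_;
};

// Wraps a BaseIterator and patches each materialized column with the
// deltas that are visible at the scanner's snapshot timestamp.
class DeltaApplyingIterator {
 public:
  DeltaApplyingIterator(BaseIterator base, const std::vector<Delta>* deltas,
                        int64_t snapshot_ts)
      : base_(std::move(base)), deltas_(deltas), snapshot_ts_(snapshot_ts) {}

  size_t num_rows() const { return base_.num_rows(); }

  ColumnBlock MaterializeColumn(size_t col) const {
    ColumnBlock block = base_.MaterializeColumn(col);
    for (const Delta& d : *deltas_) {
      if (d.col == col && d.timestamp <= snapshot_ts_) {
        assert(d.row < block.size());
        block[d.row] = d.new_value;  // later deltas overwrite earlier ones
      }
    }
    return block;
  }

 private:
  BaseIterator base_;
  const std::vector<Delta>* deltas_;
  int64_t snapshot_ts_;
};

// A simple "column value >= min_inclusive" predicate.
struct Predicate {
  size_t col;
  int64_t min_inclusive;
};

// Column-at-a-time scan: materialize one column, apply its deltas (inside
// the iterator), evaluate the predicate, then move on to the next column.
std::vector<bool> SelectRows(const DeltaApplyingIterator& iter,
                             const std::vector<Predicate>& predicates) {
  std::vector<bool> selected(iter.num_rows(), true);
  for (const Predicate& p : predicates) {
    const ColumnBlock block = iter.MaterializeColumn(p.col);
    for (size_t r = 0; r < block.size(); ++r) {
      if (block[r] < p.min_inclusive) selected[r] = false;
    }
  }
  return selected;
}
```

  In this sketch the caller only ever talks to the outer iterator, so the scan logic does not need to know whether a value came from the base data or from a delta. That is the general shape to look for when following the iterators created in DiskRowSet::NewRowIterator().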

 

HTH

David

 

 

  
