Since your question is about reading the files while they are still being written, the HDFS reading-while-writing (tail) semantics are worth going over. They are explained at https://www.aosabook.org/en/hdfs.html
(section 8.3.1, especially the notes about the hflush operation).
In Flume's writer model, all events read from the channel are immediately serialized and handed to the configured output writers. When the writer is an HDFS type, this means all data goes to the DataNode replica pipeline in chunks of 64 KiB or 128 KiB, as the local buffer (io.file.buffer.size config) is flushed. These bytes will be in the HDFS file, but they will not normally be visible to readers until the file is closed or the next block is created (at the configured dfs.blocksize boundary).
Periodically, Flume also calls hflush() on the HDFS writer, which updates the readable length of the file to match the bytes written so far. Flume does this whenever the number of events reaches the configured batch size (default: 100 events).
So if you run 'hdfs dfs -tail -f' on an open, Flume-written file, you can expect to see data come out in bursts of 100 or so events each, instead of flowing freely as it would on a local filesystem with a 4 KiB block/page size. This also means that if only 99 more entries were written at the end of the file, and the rolling interval/size has not yet been reached, you will see those last entries only when the file is rolled (i.e. it's closed and renamed to lose the .tmp suffix).
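To make the burst behaviour above concrete, here is a small toy model in Python. It is only a sketch, not a real HDFS or Flume client: the class OpenFileModel and its fields are made up for illustration, it assumes one hflush() exactly every batch_size events, and it ignores the block-boundary case mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class OpenFileModel:
    """Toy model of an open, Flume-written HDFS file (hypothetical, not a real client)."""
    batch_size: int = 100  # assumed to mirror the sink's batch size
    written: int = 0       # events handed to the writer so far
    visible: int = 0       # events a concurrent reader can see

    def append(self, n_events: int) -> None:
        for _ in range(n_events):
            self.written += 1
            # Stand-in for hflush(): the readable length catches up
            # with the written length once a full batch is reached.
            if self.written % self.batch_size == 0:
                self.visible = self.written

    def roll(self) -> None:
        # Closing the file makes everything written so far readable.
        self.visible = self.written


f = OpenFileModel()
f.append(250)
print(f.written, f.visible)  # 250 200
f.append(49)
print(f.written, f.visible)  # 299 200
f.roll()
print(f.written, f.visible)  # 299 299
```

In this model the trailing 99 events stay hidden from a tailing reader until roll() is called, which matches the case described above.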
What's the difference in size between the 100 KiB mark (where you've observed each file stop) and the final size once the file is rolled by Flume? Is it small enough to be accounted for by up to 100 more new entries?
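One rough way to answer that is to divide the size gap by your average event size. The helper below is a sketch only: the function name and the example numbers (final size, average event size) are hypothetical placeholders, so plug in the values you actually observe.

```python
def hidden_events_estimate(visible_bytes, final_bytes, avg_event_bytes, batch_size=100):
    """Estimate how many events were hidden from readers before the roll.

    Returns (estimated_hidden_events, fits_in_one_batch).
    All inputs are values you measure yourself; nothing here talks to HDFS.
    """
    if avg_event_bytes <= 0:
        raise ValueError("avg_event_bytes must be positive")
    hidden_bytes = final_bytes - visible_bytes
    hidden_events = hidden_bytes / avg_event_bytes
    return hidden_events, hidden_events <= batch_size


# Hypothetical numbers: file appears stuck at 100 KiB, rolls at 110000 bytes,
# and events average 200 bytes each.
estimate, fits = hidden_events_estimate(100 * 1024, 110_000, 200)
print(estimate, fits)  # 38.0 True
```

If the estimate comes out well above the batch size, the gap is probably not explained by a single unflushed batch and something else is worth looking at.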