We are a group of people trying to understand the architecture of
erasure coding in Hadoop 3.0. We have been facing difficulties in
understanding a few terms and concepts regarding the same.
1. What do the terms Block, Block Group, Stripe, Cell, and Chunk mean in
the context of erasure coding? (These terms seem to take on different
meanings and are used interchangeably across various documentation and
blogs.) How have they been incorporated into the reading and writing of EC data?
2. How has the idea/concept of the block from previous versions been carried over to EC?
3. The higher-level APIs, those of ErasureCoders and ErasureCodec, still
haven't been plugged into Hadoop. Also, I haven't found any new Jira
regarding the same. Are there any updates or pointers regarding the
incorporation of these APIs into Hadoop?
4. How is the datanode for reconstruction work chosen? Also, how are the buffer sizes for the reconstruction work determined?
Thanks in advance for your time and consideration.