
How erasure encoding policy works


Expert Contributor

Hi All,

I'm trying to understand how Hadoop 3 stores data on HDFS using erasure coding.

According to the erasure coding documentation, six built-in policies are currently supported:

RS-3-2-1024k, RS-6-3-1024k, RS-10-4-1024k, RS-LEGACY-6-3-1024k, XOR-2-1-1024k and REPLICATION.

Replication is the general approach that was also used in Hadoop 2 (the data is replicated 3x).

How do Reed-Solomon policies such as RS-3-2-1024k (3 data blocks, 2 parity blocks and a 1024k cell size) or RS-6-3-1024k (6 data blocks, 3 parity blocks and a 1024k cell size) store the data?

Suppose we have 3 DataNodes, 2 NameNodes and 1 edge node. We have to store a 1 GB file (abc.txt) and the block size is 128 MB. How do RS-3-2-1024k and RS-6-3-1024k work in this case?

What is the meaning of 6 data blocks, and of 1024k?

Are there any specific prerequisites for the number of DataNodes required by each policy?
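As a rough illustration of how the policy names can be read, here is a small Python sketch. It assumes the naming pattern codec-dataUnits-parityUnits-cellSize, and that a policy needs at least dataUnits + parityUnits DataNodes so that every block of a group can sit on a different node, which is how I read the Apache HDFS erasure coding documentation. The function name parse_ec_policy and the dictionary keys are made up for this example and are not part of any Hadoop API.

```python
import re


def parse_ec_policy(name):
    """Split a policy name such as 'RS-6-3-1024k' into its parts.

    Hypothetical helper for illustration only, not a Hadoop API.
    """
    # codec (possibly with a suffix like -LEGACY), data units, parity units, cell size in KB
    m = re.fullmatch(r"([A-Z]+(?:-[A-Z]+)*)-(\d+)-(\d+)-(\d+)k", name)
    if m is None:
        raise ValueError(f"unrecognised policy name: {name}")
    data = int(m.group(2))
    parity = int(m.group(3))
    return {
        "codec": m.group(1),
        "data_units": data,
        "parity_units": parity,
        "cell_size_bytes": int(m.group(4)) * 1024,
        # Assumption: one DataNode per block in a group (the stripe width).
        "min_datanodes": data + parity,
        # Extra raw storage relative to the data itself (0.5 means 50%).
        "storage_overhead": parity / data,
    }


for policy in ["RS-3-2-1024k", "RS-6-3-1024k", "XOR-2-1-1024k"]:
    print(policy, parse_ec_policy(policy))
```

Under these assumptions, RS-6-3-1024k would need 9 DataNodes and RS-3-2-1024k would need 5, so of the policies named above only XOR-2-1-1024k would fit a cluster with 3 DataNodes.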

Thanks in advance for helping me understand this Hadoop 3 concept.

Regards,

Vinay K


2 REPLIES

Re: How erasure encoding policy works

New Contributor

Hello,

Have a look at the following blog post: https://blog.cloudera.com/blog/2015/09/introduction-to-hdfs-erasure-coding-in-apache-hadoop/ (specifically the section under "Design and Implementation").

This should help explain it further.

Re: How erasure encoding policy works

Expert Contributor

Hi @Pulkit Bhardwaj

I have gone through this link. From it I mainly understood the performance of 3-way replication vs. EC.

I still don't understand how the data is stored in HDFS.

If I have to store a 1 GB file in HDFS, the file is logically divided into 1024 MB / 128 MB = 8 blocks. How does RS-6-3-1024k store these 8 blocks? What is the meaning of 6 data blocks in RS, and how do the 3 parity blocks work?
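For the 1 GB example, the size arithmetic can be sketched as follows. This assumes the striped layout described in the Apache HDFS erasure coding documentation: the file is cut into 1024k cells that are written round-robin across the 6 data blocks of a block group, each group gets 3 parity blocks, and each internal block is capped at the 128 MB block size. striped_layout is a hypothetical helper, not a Hadoop API, and it only computes sizes; it does not perform the Reed-Solomon encoding itself.

```python
MB = 1024 * 1024


def striped_layout(file_size, data_units, parity_units,
                   block_size=128 * MB, cell_size=1 * MB):
    """Return per-block-group sizes for a striped EC file (illustrative only)."""
    group_capacity = data_units * block_size  # user data one block group can hold
    stripe_size = data_units * cell_size      # user data in one full stripe
    groups = []
    remaining = file_size
    while remaining > 0:
        in_group = min(remaining, group_capacity)
        full_stripes, tail = divmod(in_group, stripe_size)
        data_blocks = []
        for i in range(data_units):
            size = full_stripes * cell_size
            # The last, partial stripe fills cells from the first data block onward.
            size += min(max(tail - i * cell_size, 0), cell_size)
            data_blocks.append(size)
        # Assumption: each parity block is as long as the longest data block.
        parity_blocks = [max(data_blocks)] * parity_units
        groups.append({
            "data_bytes": in_group,
            "data_blocks": data_blocks,
            "parity_blocks": parity_blocks,
        })
        remaining -= in_group
    return groups


for n, g in enumerate(striped_layout(1024 * MB, data_units=6, parity_units=3), 1):
    print("group", n,
          "data MB:", [b // MB for b in g["data_blocks"]],
          "parity MB:", [b // MB for b in g["parity_blocks"]])
```

With these assumptions, the 1 GB file is not stored as 8 independent 128 MB blocks. It becomes 2 block groups: the first holds 768 MB as 6 data blocks of 128 MB plus 3 parity blocks of 128 MB, and the second holds the remaining 256 MB as 6 smaller data blocks (four of 43 MB and two of 42 MB) plus 3 parity blocks of 43 MB.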

Does EC further divide the 8 blocks into sub-blocks?

Could anyone help me understand the logic?