HDFS sizing and the right model

Explorer

Good day guys, I'm a newbie in Cloudera and wanted to ask 2 questions.

 

1) I have 20 TB of data that I need to migrate to 10 servers. Do I need 20 TB of disk on each server?

2) How do I organize the right HDFS model (NameNode, DataNode, SecondaryNameNode) on those 10 servers?

 

Thanks, I hope to receive an answer very soon )

 

 

Re: HDFS sizing and the right model

Expert Contributor

Hi @Adilm,

 

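 1) If you only want to migrate the data, you can compress it and spread it across the other nodes/servers, so you do not need 20 TB of disk on each one.

     However, if the data must stay available, the disk you need per server depends on the replication factor (data size x replication factor / number of servers):

         - Replication factor 10: you need 20 TB per server.

         - Replication factor 1: you only need 20 TB in total, distributed across the 10 servers (2 TB each).

         - In between: replication factor 5 needs 10 TB per server (the HDFS default, 3, needs 6 TB per server).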
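
The arithmetic behind these scenarios can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: blocks are spread evenly over the DataNodes, no compression is applied, and no extra headroom is reserved for temporary or non-HDFS data. The function name `disk_per_datanode_tb` is made up for this illustration and is not part of any Hadoop or Cloudera tool.

```python
def disk_per_datanode_tb(data_tb: float, replication: int, datanodes: int) -> float:
    """Estimate the raw disk (in TB) each DataNode needs.

    Assumes replicas are spread evenly and ignores compression and
    any headroom for temporary or non-HDFS data.
    """
    if replication < 1 or datanodes < 1:
        raise ValueError("replication and datanodes must both be at least 1")
    # HDFS keeps at most one replica of a block per DataNode, so a
    # replication factor above the DataNode count cannot be satisfied.
    if replication > datanodes:
        raise ValueError("replication factor cannot exceed the number of DataNodes")
    return data_tb * replication / datanodes


if __name__ == "__main__":
    # 20 TB of data on 10 servers, for a few replication factors.
    for rf in (1, 3, 5, 10):
        per_node = disk_per_datanode_tb(20, rf, 10)
        print(f"replication {rf}: {per_node:.1f} TB per server")
```

If some of the 10 servers are reserved for master roles and only 8 store data, pass `datanodes=8` instead; the per-server figure goes up accordingly.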

 

2) It depends. You need one NameNode, one SecondaryNameNode, and, for example, 8 DataNodes. Pay attention to the resources of your hosts.

 

 

 

Regards,

Manu.

 

 

 

Re: HDFS sizing and the right model

Explorer

Thanks for your reply. If I understand correctly, the disk size on each server depends on the replication factor I set. Is there a table that relates replication factor to disk sizing?

 

I also wanted to ask about the resources on each node. In summary, I need some documentation about replication factor, sizing and RAM usage.

 

 

Re: HDFS sizing and the right model

Expert Contributor

Hi @Adilm,

 

You are right. There is no such table; you have to study your own scenario (HA, security, number of accesses, ...).

 

Some questions:

    - How many users?

    - How much data?

 

All the documentation is available here, according to your version:

  https://www.cloudera.com/documentation/enterprise/latest.html

    

 

 

Regards,

Manu.

Re: HDFS sizing and the right model

Explorer
Thanks )