Support Questions

Babak · 01-03-2018

Is there an upper limit on capacity per node? Can DataNodes scale to more than 100 TB per node?

Harsh J · 03-17-2018

There are no limits in the source code implementation, if that is what you are asking. There are practical limits, though, that you will run into as per-node storage grows: the replication bandwidth needed to re-replicate a node's blocks when that node is lost, and the block reporting load, which affects low-latency operations.

See also our Hardware Requirements guide: https://www.cloudera.com/documentation/enterprise/release-notes/topics/hardware_requirements_guide.h...
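To make the replication bandwidth point concrete, here is a rough back-of-the-envelope sketch of how long the cluster might take to restore full replication after losing one dense node. The function name and the 1 GB/s aggregate throughput figure are assumptions for illustration only; real re-replication speed depends on cluster size, network layout, and how replication work is scheduled.

```python
def rereplication_hours(node_data_bytes, aggregate_bytes_per_sec):
    """Estimate hours needed to re-create the replicas held by a lost node.

    node_data_bytes: amount of block data stored on the lost DataNode.
    aggregate_bytes_per_sec: ASSUMED cluster-wide throughput available
        for re-replication (a hypothetical figure, not a measured one).
    """
    seconds = node_data_bytes / aggregate_bytes_per_sec
    return seconds / 3600


# Example: a full 100 TB node, with an assumed 1 GB/s of aggregate
# re-replication throughput across the rest of the cluster.
hours = rereplication_hours(100 * 10**12, 1 * 10**9)
print(f"{hours:.1f} hours")  # prints "27.8 hours"
```

The denser the node, the longer the cluster runs with under-replicated blocks after a failure, which is the practical ceiling being described rather than any hard limit in the code.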

shamznetz · 03-20-2018

Hi,

For a DataNode with 100 TB of storage, how much RAM is required?

weichiu · 03-20-2018

That's mostly a function of the number of blocks stored on the DataNode. A rule of thumb is 1 GB of DataNode heap for every one million blocks stored on that DataNode.

Harsh J · 03-20-2018

Agreed. You shouldn't need more than 3-4 GiB of heap, assuming the actual block count is 3x to 4x the ideal block count for that storage (storage capacity divided by block size).
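As a rough sketch of that arithmetic, the snippet below combines the "1 GB of heap per million blocks" rule of thumb with the x3 to x4 factor. It assumes a 128 MiB block size (the HDFS default for `dfs.blocksize`) and treats 100 TB as 100 TiB; the function and parameter names are made up for illustration, and your block size may differ.

```python
import math


def estimate_dn_heap_gib(storage_bytes,
                         block_size_bytes=128 * 1024**2,
                         block_count_factor=4,
                         blocks_per_gib_heap=1_000_000):
    """Estimate DataNode heap from storage size.

    block_size_bytes: ASSUMED 128 MiB block size.
    block_count_factor: the x3/x4 multiplier over the ideal block count,
        allowing for blocks that are not completely full.
    blocks_per_gib_heap: the rule of thumb of ~1 GB heap per million blocks.
    """
    ideal_blocks = storage_bytes / block_size_bytes
    expected_blocks = ideal_blocks * block_count_factor
    heap_gib = math.ceil(expected_blocks / blocks_per_gib_heap)
    return ideal_blocks, expected_blocks, heap_gib


# 100 TiB of storage with the x4 factor.
ideal, expected, heap = estimate_dn_heap_gib(100 * 1024**4)
print(f"{ideal:,.0f} ideal blocks, {expected:,.0f} expected, {heap} GiB heap")
# prints "819,200 ideal blocks, 3,276,800 expected, 4 GiB heap"

# The same storage with the x3 factor lands at 3 GiB.
_, _, heap_x3 = estimate_dn_heap_gib(100 * 1024**4, block_count_factor=3)
```

Under these assumptions the result lands in the 3-4 GiB range quoted above; a cluster with many small files would have a higher block count per TB and need proportionally more heap.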

Tomtong · 03-28-2019

Can you provide more information on the reporting load (for low-latency operations) issue when a DataNode has 100 TB+ of storage? We need archive nodes used purely for HDFS storage, with no YARN/Spark running on them. They will only store data placed there by the storage migration policy. The nodes' network and storage I/O bandwidth is considered sufficient to handle the larger storage size.

Cloudera Community

Maximum capacity per DataNode