Explorer
Posts: 19
Registered: ‎05-09-2017

How is throughput calculated

Hello,

 

A cluster is receiving data only from Flume, and I want to calculate the throughput of Flume writing to HDFS so that I can gauge network, I/O, storage, etc. for capacity planning for a new environment.

 

How can I calculate this?

 

I did some work; please let me know if I am going in the right direction.

 

Datanodes: 8
Each datanode has 12 disks: 2.0T each
Total disk space: 24T per datanode
CPU: 8 cores with Hyperthreading (16 logical cores)
Physical Memory : 62GB per datanode
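
To show how I am turning the figures above into capacity, here is a rough sketch of the arithmetic. The replication factor of 3 is the HDFS default and only an assumption for this cluster, and the 25% headroom for non-DFS use is a hypothetical placeholder, not something I measured.

```python
# Rough capacity arithmetic from the hardware figures listed above.
DATANODES = 8
DISKS_PER_NODE = 12
DISK_TB = 2.0

REPLICATION = 3    # assumption: HDFS default replication factor
HEADROOM = 0.25    # assumption: fraction kept free for non-DFS use and temp space


def raw_per_node_tb(disks=DISKS_PER_NODE, disk_tb=DISK_TB):
    """Raw disk space on one datanode, in TB."""
    return disks * disk_tb


def raw_cluster_tb(nodes=DATANODES):
    """Raw disk space across all datanodes, in TB."""
    return nodes * raw_per_node_tb()


def usable_cluster_tb(replication=REPLICATION, headroom=HEADROOM):
    """Space left for unique data after headroom and replication, in TB."""
    return raw_cluster_tb() * (1 - headroom) / replication


if __name__ == "__main__":
    print("raw per node (TB):", raw_per_node_tb())
    print("raw cluster (TB):", raw_cluster_tb())
    print("usable cluster (TB):", usable_cluster_tb())
```

If the real replication factor or headroom differs, only the two constants marked as assumptions need to change.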


I see this metric from HDFS: Total bytes written across datanodes (1d): 2.2Mb/sec. Is this the correct metric to report?
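
This is a minimal sketch of how I would convert that rate into a daily volume and a rough time to fill the disks. I am not sure whether the chart's "Mb" means megabytes or megabits, so the hypothetical helper `daily_volume_gb` handles both. I am also assuming the metric already includes replica writes, which is why it is compared against raw capacity (8 datanodes x 24T = 192T); if that assumption is wrong, the result changes by the replication factor.

```python
SECONDS_PER_DAY = 86_400


def daily_volume_gb(rate, unit="MB"):
    """Convert a sustained per-second write rate into GB per day (decimal units).

    unit="MB" treats the rate as megabytes/sec, unit="Mb" as megabits/sec.
    """
    if unit == "MB":
        mb_per_sec = rate
    elif unit == "Mb":
        mb_per_sec = rate / 8  # 8 bits per byte
    else:
        raise ValueError(f"unknown unit: {unit!r}")
    return mb_per_sec * SECONDS_PER_DAY / 1000


def days_to_fill(capacity_tb, gb_per_day):
    """Days until capacity_tb is consumed at a constant gb_per_day."""
    return capacity_tb * 1000 / gb_per_day


if __name__ == "__main__":
    for unit in ("MB", "Mb"):
        per_day = daily_volume_gb(2.2, unit)
        print(unit, round(per_day, 2), "GB/day,",
              round(days_to_fill(192, per_day)), "days to fill 192 TB raw")
```

A one-day average hides peaks, so for sizing network and disk I/O the peak rate over a shorter window is probably the more useful input than this average.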

 

I also have a question about which time duration to select from these charts.

 

 
