
How is throughput calculated?



Our cluster receives data only from Flume. I want to calculate the throughput of Flume writing to HDFS so that I can gauge network, I/O, storage, and other requirements for capacity planning of a new environment.


How can I calculate this?


I did some work on this already. Please let me know if I am going in the right direction.


Datanodes: 8
Each datanode has 12 disks, 2.0 TB each
Total disk space: 24 TB per datanode
8-core CPU with Hyper-Threading (16 logical cores)
Physical memory: 62 GB per datanode
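
For reference, here is a rough sketch of the raw capacity arithmetic for the hardware listed above. The replication factor of 3 and the 25% reserve for non-HDFS use and headroom are assumptions (3 is the usual HDFS default, but I have not confirmed the value on this cluster), and the function names are only illustrative.

```python
def raw_capacity_tb(datanodes: int, disks_per_node: int, disk_tb: float) -> float:
    """Raw disk capacity of the cluster in TB, before replication or overhead."""
    return datanodes * disks_per_node * disk_tb


def usable_capacity_tb(raw_tb: float, replication: int = 3,
                       reserve_fraction: float = 0.25) -> float:
    """Approximate capacity for unique data.

    replication and reserve_fraction are assumed values, not measured ones.
    """
    return raw_tb * (1.0 - reserve_fraction) / replication


if __name__ == "__main__":
    raw = raw_capacity_tb(datanodes=8, disks_per_node=12, disk_tb=2.0)
    usable = usable_capacity_tb(raw)
    print(f"raw: {raw:.0f} TB, usable (assumed settings): {usable:.0f} TB")
```

With these assumptions, the 8 datanodes give 192 TB raw and about 48 TB for unique data.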

The HDFS metrics show "Total bytes written across datanodes (1d)": 2.2Mb/sec. Is this the correct metric to report?
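
To turn that rate into a daily volume for planning, this is the conversion I have in mind, as a sketch only. It assumes the chart value means megabytes per second (if it is megabits, divide by 8), that the rate is sustained over the whole day, and that bytes written across datanodes count every replica, so the unique ingest rate would be the reported rate divided by the replication factor (assumed to be 3 here). The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def daily_volume_gb(rate_mb_per_s: float) -> float:
    """Convert a sustained rate in MB/s into GB written per day (1 GB = 1000 MB)."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1000.0


def unique_ingest_rate(datanode_write_rate: float, replication: int = 3) -> float:
    """Estimate the pre-replication ingest rate.

    Assumes the datanode metric includes replica writes; replication=3 is assumed.
    """
    return datanode_write_rate / replication


if __name__ == "__main__":
    reported = 2.2  # value from the chart, unit assumed to be MB/s
    print(f"written to disk per day: {daily_volume_gb(reported):.1f} GB")
    ingest = unique_ingest_rate(reported)
    print(f"unique data per day:     {daily_volume_gb(ingest):.1f} GB")
```

Under those assumptions, 2.2 MB/s works out to roughly 190 GB written to disk per day, or roughly 63 GB of unique data per day.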


I also have a question about which time duration to select in these charts.