How to troubleshoot poor throughput reported by DFSIO

Expert Contributor

I ran the TestDFSIO benchmark on a newly created basic 4-node cluster (all 4 nodes are DataNodes, and all master services are co-located). The output is below. The throughput and average IO rate for "write" are very low compared with "read". What are the possible reasons, and is there any way to improve them?

----- TestDFSIO ----- : read
           Date & time: Wed Jan 18 16:50:33 PST 2017
       Number of files: 10
Total MBytes processed: 1000000.0
     Throughput mb/sec: 82.24845564131054
Average IO rate mb/sec: 89.47832489013672
 IO rate std deviation: 26.504227991353375
    Test exec time sec: 1569.043

----- TestDFSIO ----- : write
           Date & time: Wed Jan 18 16:24:20 PST 2017
       Number of files: 10
Total MBytes processed: 1000000.0
     Throughput mb/sec: 12.744570863790308
Average IO rate mb/sec: 12.745065689086914
 IO rate std deviation: 0.07952523184580733
    Test exec time sec: 7922.071

1 REPLY

Re: How to troubleshoot poor throughput reported by DFSIO

Expert Contributor

Can you check whether you have these set in sysctl?

net.core.somaxconn = 4096 and net.ipv4.tcp_fin_timeout = 10. We also have suggested values for the dirty ratio (vm.dirty_ratio = 50) and the background ratio (vm.dirty_background_ratio = 20).
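
To compare a node's current settings against the values suggested here, something like the following minimal sketch could be run on each DataNode. It assumes a Linux host where sysctl keys are exposed under /proc/sys, and it assumes that "dirty ratio" and "background ratio" refer to the standard vm.dirty_ratio and vm.dirty_background_ratio keys. The helper names (read_sysctl, compare_sysctls) are made up for illustration.

```python
from pathlib import Path

# Values suggested in this reply. The vm.* key names are an assumption:
# the reply only says "dirty ratio" and "background ratio".
SUGGESTED = {
    "net.core.somaxconn": 4096,
    "net.ipv4.tcp_fin_timeout": 10,
    "vm.dirty_ratio": 50,
    "vm.dirty_background_ratio": 20,
}


def read_sysctl(key, root="/proc/sys"):
    """Return the integer value of a sysctl key, or None if it cannot be read."""
    # A key such as net.core.somaxconn maps to <root>/net/core/somaxconn.
    path = Path(root, *key.split("."))
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def compare_sysctls(suggested=SUGGESTED, root="/proc/sys"):
    """Map each key to a (current, suggested) pair."""
    return {key: (read_sysctl(key, root), want) for key, want in suggested.items()}


if __name__ == "__main__":
    for key, (current, want) in compare_sysctls().items():
        status = "ok" if current == want else "differs"
        print(f"{key}: current={current} suggested={want} ({status})")
```

The script only reads values. Any change would still be made through sysctl or /etc/sysctl.conf as usual.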

You can also run a performance test against the local disks directly, instead of HDFS, using sysbench or fio.
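
If neither tool is installed yet, a very rough first check of local sequential write speed can be done with a short script. This is only a sketch and not a replacement for sysbench or fio: it writes zero-filled blocks through the page cache and calls fsync once at the end, so the result is only indicative. The function name and the target path are placeholders; point the path at a directory on the disk that backs the DataNode data directory.

```python
import os
import time


def sequential_write_mb_per_sec(path, total_mb=256, block_mb=1):
    """Write total_mb of data to path in block_mb chunks and return MB/s."""
    block = b"\0" * (block_mb * 1024 * 1024)
    blocks = total_mb // block_mb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        # Flush Python's buffer and ask the kernel to push data to the device,
        # so the timing includes the actual disk write.
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return (blocks * block_mb) / elapsed


if __name__ == "__main__":
    target = "/tmp/seqwrite.tmp"  # placeholder path, change to the disk under test
    try:
        rate = sequential_write_mb_per_sec(target)
        print(f"sequential write: {rate:.1f} MB/s")
    finally:
        if os.path.exists(target):
            os.remove(target)
```

If the local rate on each node is far above the roughly 12.7 MB/s per-file write rate reported by TestDFSIO, the disks themselves are less likely to be the bottleneck.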