Need to get maximum network performance with Cloudera CDH5 from a 40G network on RedHat 6.6

Expert Contributor

I'm trying to get maximum throughput with Cloudera on RedHat 6.6 on six Dell R730s running kernel 3.18.1, and using 2 - 850MB, 3G SSD transfer per second HDDs with modified drivers that have been tested. So far I've tried decommissioning the MapReduce TaskTracker on every node except one, as suggested, but it made no real difference in NIC speed. I want to max out the connection speed on all nodes if possible.

 

I've tried:

sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/jars/hadoop-test-2.5.0-mr1-cdh5.3.1.jar TestDFSIO -write -nrFiles 100000 -fileSize 50

and

sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/jars/hadoop-test-2.5.0-mr1-cdh5.3.1.jar TestDFSIO -write -nrFiles 500 -fileSize 10GB

without good results.

 

I've already tested throughput with netperf, but when I use DFSIO I can't get Cloudera to push the network to the same maximum level that netperf reaches.

 

Any suggestions would help greatly.
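
For a sense of scale, the two runs above ask TestDFSIO to write roughly the same amount of data. The sketch below works that out, assuming a bare `-fileSize 50` is read as megabytes (which I believe is TestDFSIO's default unit), that 1 GB is 1024 MB, and ignoring HDFS replication. The function name `dfsio_total_mb` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: total megabytes requested by one TestDFSIO run.
# Usage: dfsio_total_mb <nrFiles> <fileSize in MB>
dfsio_total_mb() {
    echo $(( $1 * $2 ))
}

# Run 1: 100000 files x 50 MB (assumes the bare "50" means MB)
run1=$(dfsio_total_mb 100000 50)
# Run 2: 500 files x 10 GB, with 1 GB taken as 1024 MB
run2=$(dfsio_total_mb 500 $(( 10 * 1024 )))

echo "run 1: ${run1} MB"
echo "run 2: ${run2} MB"
```

Both come out at about 5 TB before replication, so the main difference between the two runs is the file count (100000 small files versus 500 large ones) rather than the volume written.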

 

 

Accepted Solution

Re: Need to get maximum network performance with Cloudera CDH5 from a 40G network on RedHat 6.6

Expert Contributor

OK, the setup is simple: you create the datanodes and run a single TaskTracker (TT) on the namenode. That took the network traffic to the other nodes up to 3500MB, which worked.
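
To put the 3500MB figure in context, here is a rough calculation of how much of a 40G link it would use. This assumes the figure means megabytes per second, uses decimal units (1 Gbit/s = 125 MB/s), and ignores protocol overhead. The helper name `link_utilization_pct` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: percent of raw line rate in use.
# Usage: link_utilization_pct <observed MB/s> <link speed in Gbit/s>
link_utilization_pct() {
    # 1 Gbit/s = 1000 Mbit/s = 125 MB/s (decimal units, no protocol overhead)
    line_rate_mb=$(( $2 * 125 ))
    # Integer percentage, rounded down
    echo $(( $1 * 100 / line_rate_mb ))
}

# Assumption: "3500MB" in the post above means 3500 MB/s on a 40 Gbit/s link
link_utilization_pct 3500 40
```

Under those assumptions the result is 70, meaning about 70% of the raw 5000 MB/s line rate.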

Re: Need to get maximum network performance with Cloudera CDH5 from a 40G network on RedHat 6.6

Expert Contributor

I would also like to cut down on local HDD writes so that data is distributed evenly across all machines, which should also give me more network traffic.

 


Re: Need to get maximum network performance with Cloudera CDH5 from a 40G network on RedHat 6.6

Expert Contributor

I found a way to increase network performance, but only for writes. When I run a DFSIO read test, it only seems to read from the local drive on one system rather than from multiple systems. I need the reads to go over the network, not to local disk. Can anybody help with how to force network reads using DFSIO?
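
For reference, a minimal sketch of a write, read, clean sequence with TestDFSIO, reusing the jar path and sizes from the commands earlier in this thread. The `dfsio_cmd` helper is hypothetical and only prints each command so it can be reviewed before being run. The read pass expects the files produced by an earlier write pass with the same `-nrFiles` and `-fileSize`.

```shell
#!/bin/sh
# Jar path copied from the write commands earlier in this thread.
JAR=/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/jars/hadoop-test-2.5.0-mr1-cdh5.3.1.jar

# Hypothetical helper: print the TestDFSIO command for a given mode
# (write, read or clean) instead of executing it.
dfsio_cmd() {
    mode=$1; nr_files=$2; file_size=$3
    if [ "$mode" = "clean" ]; then
        # -clean removes the benchmark's files from HDFS
        echo "sudo -u hdfs hadoop jar $JAR TestDFSIO -clean"
    else
        echo "sudo -u hdfs hadoop jar $JAR TestDFSIO -$mode -nrFiles $nr_files -fileSize $file_size"
    fi
}

dfsio_cmd write 500 10GB   # create the test files first
dfsio_cmd read 500 10GB    # then read the same files back
dfsio_cmd clean            # remove the test data afterwards
```

Running the read command does not by itself force remote reads. Whether a read crosses the network depends on where the map tasks are scheduled relative to the block replicas.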
