Created on 03-16-2015 02:13 PM - edited 09-16-2022 02:24 AM
I'm trying to get maximum throughput with Cloudera on Red Hat 6.6 on six Dell R730s with kernel 3.18.1, using two 850MB, 3G-transfer-per-second SSD drives with modified drivers, which have been tested. So far I've tried decommissioning the MapReduce TaskTracker on all nodes except one, as suggested, but it didn't really make any difference in NIC speed. I want to max out the connection speed on all nodes if possible.
I've tried: sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/jars/hadoop-test-2.5.0-mr1-cdh5.3.1.jar TestDFSIO -write -nrFiles 100000 -fileSize 50
and
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/jars/hadoop-test-2.5.0-mr1-cdh5.3.1.jar TestDFSIO -write -nrFiles 500 -fileSize 10GB
without good results.
I've already tested throughput with netperf, but with DFSIO I can't seem to get Cloudera to push the network to the same maximum level I reached with netperf.
Any suggestions would help greatly.
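For a sense of scale, the two TestDFSIO runs above can be sized with a little arithmetic. This is only a rough sketch: it assumes the unit-less -fileSize 50 is read as 50 MB (which I believe is TestDFSIO's default unit), that 1 GB = 1024 MB, and that replication is ignored. The helper name total_mb is hypothetical.

```shell
#!/bin/sh
# total_mb FILES SIZE_MB : hypothetical helper, total data written in MB
# (before replication).
total_mb() {
    echo $(( $1 * $2 ))
}

# Run 1: -nrFiles 100000 -fileSize 50   (assumed to mean 50 MB per file)
run1_mb=$(total_mb 100000 50)
# Run 2: -nrFiles 500 -fileSize 10GB    (10 GB taken as 10240 MB)
run2_mb=$(total_mb 500 10240)

echo "run 1: ${run1_mb} MB across 100000 files"
echo "run 2: ${run2_mb} MB across 500 files"
```

Under those assumptions both runs write roughly 5 TB, so they differ mainly in file count (many small files versus a few large ones) rather than in total volume.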
Created 03-24-2015 12:27 PM
OK, the setup is simple: you just create DataNodes with one TaskTracker (TT) on the NameNode. That took the network to 3500MB to the other nodes, which worked.
Created 03-16-2015 03:18 PM
I would also like to cut down on local HDD writes so that data is distributed evenly across all machines, which should also give me more network traffic.
Created 03-17-2015 02:11 PM
I found a way to increase network performance, but only for writes. When I run a DFSIO read test, it only seems to use the local drive on one system instead of reading from multiple systems. I need the reads to go over the network rather than to local disk. Can anybody help with how to force network reads using DFSIO?
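For reference, a TestDFSIO read pass normally mirrors the write command with -read in place of -write, reusing the files the write pass left behind. The sketch below only prints the two command lines rather than running them; the JAR path is copied from the original post, and the dfsio_cmd helper is hypothetical. It does not by itself force reads over the network.

```shell
#!/bin/sh
# JAR path as given in the original post.
JAR=/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/jars/hadoop-test-2.5.0-mr1-cdh5.3.1.jar

# dfsio_cmd MODE NR_FILES FILE_SIZE : hypothetical helper that prints
# (does not execute) a TestDFSIO command line for the given mode.
dfsio_cmd() {
    echo "sudo -u hdfs hadoop jar $JAR TestDFSIO -$1 -nrFiles $2 -fileSize $3"
}

# Write pass first, then a read pass over the same file count and size.
dfsio_cmd write 500 10GB
dfsio_cmd read 500 10GB
```

Keeping -nrFiles and -fileSize identical between the two passes matters, since the read pass expects to find the files the write pass created.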