Copying the file from Local to hdfs is taking huge time

New Contributor

I am trying to write a 750 MB file to HDFS using the hadoop fs -put command, but it is taking a very long time and shows the following warning:

18/05/25 18:30:40 WARN hdfs.DFSClient: Slow waitForAckedSeqno took 52640ms (threshold=30000ms)

3 REPLIES

Master Mentor

@Saurabh Srivastava

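The maximum transmission unit (MTU) may be set too low. The default is 1500 bytes; the recommended value is 9000 (jumbo frames). See the HCC document "Typical HDP Cluster Network Configuration Best Practices". Apply the change on all the nodes in the cluster, and keep in mind that jumbo frames only work if every NIC and switch on the path supports them.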

Check the current MTU setting with the ifconfig or ip link list command on Linux. In the ifconfig output, the MTU is on the fourth line of the interface block:

# /sbin/ifconfig
eth1      Link encap:Ethernet  HWaddr 08:00:27:D8:06:8F
          inet addr:192.168.0.171  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fed8:68f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:786 errors:0 dropped:0 overruns:0 frame:0
          TX packets:444 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:455353 (444.6 KiB)  TX bytes:57860 (56.5 KiB) 

You can also use

$ ip link list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:d8:06:8f brd ff:ff:ff:ff:ff:ff 
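
If a node has many interfaces, a small helper can reduce that output to just interface names and MTU values. This is only a sketch: list_mtus is a name made up here, and it assumes the ip link output format shown above.

```shell
# Hypothetical helper: read `ip link` style output on stdin and
# print "<interface> <mtu>" for each interface header line.
list_mtus() {
  awk '/^[0-9]+: / {
    name = $2
    sub(/:$/, "", name)              # strip the trailing colon from "eth1:"
    for (i = 3; i < NF; i++)         # the value follows the literal word "mtu"
      if ($i == "mtu") print name, $(i + 1)
  }'
}

# Usage on a live host (assumes iproute2 is installed):
#   ip link list | list_mtus
```

Running this on every node makes it easy to spot a host that is still at 1500.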

As you can see, the MTU is set to 1500 for eth1. To raise it to, say, 9000, use either of the following commands:

# ifconfig eth1 mtu 9000 

or

# ip link set dev eth1 mtu 9000 

Verify that the new MTU is set with the following command:

$ ip link list 

or

$ /sbin/ifconfig 
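
The interface setting alone does not prove that large frames actually reach the other nodes. One common check, sketched here, is a ping with fragmentation forbidden and a payload sized to fill exactly one frame: the MTU minus 28 bytes (20 for the IPv4 header, 8 for the ICMP header). The function name icmp_payload_for_mtu and the peer address 192.168.0.172 are made up for illustration, and the -M do option assumes the Linux iputils ping.

```shell
# Hypothetical helper: ICMP payload size that exactly fills one frame,
# i.e. MTU minus the 20-byte IPv4 header and the 8-byte ICMP header.
icmp_payload_for_mtu() {
  echo $(( $1 - 28 ))
}

# Usage (192.168.0.172 is a placeholder for another cluster node):
#   ping -M do -c 3 -s "$(icmp_payload_for_mtu 9000)" 192.168.0.172
# -M do forbids fragmentation, so the ping fails if any hop
# cannot carry a 9000-byte frame.
```

If the ping fails with the large payload but works with the default size, some device between the two nodes is probably still limited to a smaller MTU.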

To change the MTU permanently on Red Hat Linux, edit /etc/sysconfig/network-scripts/ifcfg-ethx. Notice that I have added MTU=9000:

DEVICE=eth0 
HWADDR=08:00:27:FF:AF:39 
TYPE=Ethernet 
UUID=4df8046e-59c0-4667-9a6e-daa61b83682c 
ONBOOT=no 
MTU=9000 
NM_CONTROLLED=yes 
BOOTPROTO=dhcp 
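
Since the same edit has to be repeated on every node, it could be scripted. The sketch below is one possible way, not a tested procedure for your cluster: set_ifcfg_mtu is a made-up name, and it assumes the file uses plain KEY=value lines like the example above and ends with a newline.

```shell
# Hypothetical helper: set or update the MTU= line in an ifcfg-style file.
# Arguments: <file> <mtu>
set_ifcfg_mtu() {
  file=$1
  mtu=$2
  if grep -q '^MTU=' "$file"; then
    # Rewrite the existing MTU line via a temp file (avoids sed -i differences).
    tmp=$(mktemp) || return 1
    sed "s/^MTU=.*/MTU=$mtu/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    # No MTU line yet: append one.
    printf 'MTU=%s\n' "$mtu" >> "$file"
  fi
}

# Usage (run as root; keep a backup of the file first):
#   set_ifcfg_mtu /etc/sysconfig/network-scripts/ifcfg-eth0 9000
```

Running it twice leaves a single MTU line, so it is safe to repeat.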

Restart the network service on Red Hat/CentOS:

# service network restart 

Reference: https://www.pcmech.com/article/jumbo-frames/

Now retry your copy. Hadoop 2.0 should handle big files well.

Please let me know whether the copy time improves.

Master Mentor

@Saurabh Srivastava

I noticed you created two similar threads. Could you please delete the other one?

Master Mentor

@Saurabh Srivastava

Any updates?