Why is standalone NiFi (PutHDFS) slow at ingesting small files into HDFS?

Contributor

I'm having a problem with PutHDFS being slow when writing to HDFS.

[Two screenshots attached: 15042-file1.png, 15043-file2.png]

1 ACCEPTED SOLUTION

Master Guru

Have you tried installing the hadoop client on the same machine as NiFi and then sending a similar file to HDFS from the command line? In most cases the issue here is the network or disk.

3 REPLIES

Contributor

@Bryan Bende

Btw, most of the files are small, around 10 to 50 MB.

Upon testing:

350 files, 10 MB per file, 1 Gbps network

Client to Server (SSH) - 36 secs

[admin@client1 data]$ cat transfer.logs
Fri May 5 09:42:40 +08 2017
Fri May 5 09:43:16 +08 2017
36 secs

Server to HDFS - 4,479 secs, or about 1.24 hrs (1 h 14 min 39 s)

[hadoop@master1 data]$ ./puthdfs-stdf.sh
Fri May 5 09:55:53 +08 2017
Fri May 5 11:10:32 +08 2017
4479 sec
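For scale, the two timings above can be converted into throughput. This is a rough sketch only: it assumes exactly 350 files of 10 MB each, as stated in the test description, and treats 1 MB as 10^6 bytes. The `throughput` helper is a made-up name for illustration.

```shell
#!/usr/bin/env bash
# Rough throughput from the figures quoted in the test (350 files x 10 MB).
# Assumption: "MB" means 10^6 bytes, so 1 MB/s = 8 Mbit/s.

throughput() {
  # $1 = total megabytes transferred, $2 = elapsed seconds
  awk -v mb="$1" -v s="$2" \
    'BEGIN { printf "%.2f MB/s (%.1f Mbit/s)\n", mb / s, mb / s * 8 }'
}

total_mb=$((350 * 10))

echo "SSH copy: $(throughput "$total_mb" 36)"
echo "HDFS put: $(throughput "$total_mb" 4479)"
```

On these numbers the SSH copy runs at roughly 97 MB/s, close to what a 1 Gbps link can carry, while the HDFS put manages under 1 MB/s, about 124 times slower. That gap suggests the client-to-server link itself is not the bottleneck.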

[client@master1 data]$ cat puthdfs-stdf.sh
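start=$(date +%s)                        # start time in epoch seconds
date                                     # print wall-clock start time
hdfs dfs -put *.txt /data/test/testing/  # upload every .txt file in the current directory
date                                     # print wall-clock end time
end=$(date +%s)                          # end time in epoch seconds
runtime=$((end-start))                   # elapsed seconds
echo "$runtime"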
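The script above reports only one total for the whole batch. To see whether every file is slow or only a few, each upload could be timed separately. The following is a minimal sketch, not a tested diagnostic: `time_each` and `upload` are hypothetical helper names, and the destination simply mirrors the path used in the script above.

```shell
#!/usr/bin/env bash
# Time each upload on its own so that slow files (or a uniformly slow rate)
# stand out. DEST defaults to the directory used in the original script.

DEST=${DEST:-/data/test/testing/}

upload() {
  # Same command as the original script, but for a single file.
  hdfs dfs -put "$1" "$DEST"
}

time_each() {
  # Prints "<elapsed seconds> <file>" for every file passed as an argument.
  local f start end
  for f in "$@"; do
    start=$(date +%s)
    upload "$f"
    end=$(date +%s)
    echo "$((end - start)) $f"
  done
}
```

For example, `time_each *.txt | sort -n | tail` would list the slowest uploads last. Each `hdfs dfs` call starts a new JVM, so the per-file times include some startup overhead. They are better compared against each other than against the batch total.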

I haven't installed the Hadoop client on the NiFi server (a Windows server).

Contributor

@Bryan Bende Thanks!

The slow HDFS put was caused by a slow BlockReceiver on the DataNode:

2017-06-05 13:17:34,210 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(571)) - Slow BlockReceiver write packet to mirror took 6316ms (threshold=300ms)
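In this warning, "write packet to mirror" refers to the DataNode forwarding a packet to the next DataNode in the replication pipeline. Frequent warnings of this kind therefore tend to point at the network between DataNodes or at the downstream node. To gauge how often and how badly this happens, the latencies can be pulled out of the DataNode log. This is a sketch only: the pattern is based on the single line quoted above, and `slow_block_ms` and `summarize` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Extract the latency (in ms) from "Slow BlockReceiver" warnings read on stdin.
# The pattern assumes the message format of the one log line quoted above.

slow_block_ms() {
  sed -n 's/.*Slow BlockReceiver.* took \([0-9][0-9]*\)ms.*/\1/p'
}

summarize() {
  # Reads one number per line; prints how many there were and the largest.
  awk '{ n++; if ($1 > max) max = $1 }
       END { printf "count=%d max_ms=%d\n", n, max }'
}
```

A possible use, where `datanode.log` stands in for wherever the DataNode log is stored on your cluster: `slow_block_ms < datanode.log | summarize`.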