Why is standalone NiFi (PutHDFS) slow at ingesting small files into HDFS?

I'm having a problem with PutHDFS being slow when writing to HDFS.

15042-file1.png

15043-file2.png

1 ACCEPTED SOLUTION

Have you tried installing the hadoop client on the same machine as NiFi and then sending a similar file to HDFS from the command line? In most cases the issue here is the network or disk.

@Bryan Bende

By the way, most of the files are small, around 10 to 50 MB.

Upon testing:

350 files, 10 MB per file, 1 Gbps network

Client to Server (SSH) - 36 secs

[admin@client1 data]$ cat transfer.logs
Fri May 5 09:42:40 +08 2017
Fri May 5 09:43:16 +08 2017
36 secs

Server to HDFS - 4,479 secs (about 1.24 hours)

[hadoop@master1 data]$ ./puthdfs-stdf.sh
Fri May 5 09:55:53 +08 2017
Fri May 5 11:10:32 +08 2017
4479 sec

[client@master1 data]$ cat puthdfs-stdf.sh
#!/bin/bash
# Time how long it takes to upload all .txt files to HDFS.
start=$(date +%s)
date
hdfs dfs -put *.txt /data/test/testing/
date
end=$(date +%s)
runtime=$((end - start))
echo "$runtime sec"

I haven't installed the Hadoop client on the NiFi server (a Windows server).
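
To put the two timings side by side, here is a small sketch of the arithmetic. It assumes each of the 350 files is exactly 10 MB (3,500 MB in total), and the function name throughput_mb_per_s is made up for illustration.

```shell
#!/bin/sh
# Effective throughput for the two transfers reported above.
# Assumption: 350 files of exactly 10 MB each (3500 MB in total).

throughput_mb_per_s() {
    # $1 = total megabytes, $2 = elapsed seconds
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", mb / s }'
}

total_mb=$((350 * 10))

echo "SSH copy: $(throughput_mb_per_s "$total_mb" 36) MB/s"
echo "HDFS put: $(throughput_mb_per_s "$total_mb" 4479) MB/s"
```

That works out to roughly 97 MB/s for the SSH copy, a large share of what a 1 Gbps link can carry, versus under 1 MB/s for the HDFS put, so the bottleneck is on the HDFS side rather than between client and server.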

@Bryan Bende Thanks!

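The slow HDFS put was caused by a slow BlockReceiver on a DataNode. The DataNode log showed writes to the mirror (as far as I can tell, the next DataNode in the replication pipeline) taking far longer than the 300 ms warning threshold: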

2017-06-05 13:17:34,210 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(571)) - Slow BlockReceiver write packet to mirror took 6316ms (threshold=300ms)
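
For anyone checking their own cluster, a rough sketch of how to pull the durations out of such warnings is below. The function name slow_block_receiver_ms and the example log path are assumptions for illustration; the actual DataNode log location depends on the installation.

```shell
#!/bin/sh
# List the durations (in ms) of "Slow BlockReceiver" warnings in a DataNode log.
# The log file is passed as an argument because its location varies by install.

slow_block_receiver_ms() {
    # Print one duration per matching line, e.g. "took 6316ms" -> 6316
    grep 'Slow BlockReceiver' "$1" | sed -n 's/.* took \([0-9][0-9]*\)ms.*/\1/p'
}

# Example with a hypothetical path, slowest warnings first:
# slow_block_receiver_ms /var/log/hadoop/hdfs/datanode.log | sort -rn | head
```

Many large values here would suggest checking the network between DataNodes and the disks on the receiving node.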