How does NiFi write data into HDFS? What's the internal process?

How does NiFi write data into HDFS? What internal process does it use?

If the PutHDFS processor is used, the relevant code is in its source: https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/...

In the lines of the source linked above, we can see that:
1. The flowfile is written to HDFS as a temporary file.
2. The temporary file is renamed to the desired filename.
3. A log entry is created, which looks as follows for a successful operation:

getLogger().info("copied {} to HDFS at {} in {} milliseconds at a rate of {}"

Or in case of failure:

getLogger().error("Failed to write to HDFS due to {}", new Object[]{t});

At the top of the source file, org.apache.hadoop.fs.FileSystem is imported; this is the class behind the hdfs.create call that writes the file. More info on this base class: https://hadoop.apache.org/docs/r2.8.2/api/org/apache/hadoop/fs/FileSystem.html
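
To make the write-then-rename steps concrete, here is a minimal sketch of the same pattern. It is not the NiFi or Hadoop code: it uses java.nio.file on a local directory as a stand-in, whereas the real processor goes through the Hadoop FileSystem class mentioned above. The class name TempThenRename, the method writeViaTemp, and the dot-prefixed temporary filename are assumptions made for illustration, and the log messages only loosely imitate the ones quoted above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Local-filesystem analogue of the write pattern described above.
// Hypothetical illustration only; this is not NiFi or Hadoop code.
final class TempThenRename {

    // Writes data to a temporary file in dir, then renames it to filename.
    static Path writeViaTemp(Path dir, String filename, byte[] data) throws IOException {
        Path tmp = dir.resolve("." + filename); // assumed temp naming: dot prefix
        Path target = dir.resolve(filename);
        long start = System.nanoTime();
        try {
            // Step 1: write the content to the temporary file.
            try (OutputStream out = Files.newOutputStream(tmp)) {
                out.write(data);
            }
            // Step 2: rename the temporary file to the desired filename.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            // Step 3: log the successful operation.
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("copied %d bytes to %s in %d milliseconds%n",
                    data.length, target, millis);
            return target;
        } catch (IOException e) {
            // On failure, remove any leftover temporary file and log the error.
            Files.deleteIfExists(tmp);
            System.err.println("Failed to write to " + target + " due to " + e);
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("puthdfs-sketch");
        Path written = writeViaTemp(dir, "example.txt",
                "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(Files.readAllBytes(written), StandardCharsets.UTF_8));
    }
}
```

The point of writing to a temporary name first is that a reader looking for the final filename either finds the complete file or finds nothing, rather than a half-written file.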