Load data from a remote machine to HDFS via NiFi

Explorer

Hello

Is it possible to load data from a remote machine to HDFS? If so, I understand the processor to use for this is PutHDFS. But I have read on this community blog that PutHDFS is not the right option for doing this.

Please advise.

Master Guru

@hema moger

You have not provided much detail about your use case.

What is the source of the data?

Do you plan on manipulating and/or filtering the data in any way before writing it to HDFS?

Is this a one-time move or a sustained dataflow?

The bottom line is that the NiFi PutHDFS processor can be used to write data to HDFS. Before NiFi can do this, it must ingest the data from some source. As part of this ingestion, the data is written to the NiFi content repository and one or more NiFi FlowFiles are generated. NiFi also tracks lineage information about every FlowFile as it traverses any number of processors between ingestion and termination in NiFi.
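
To make that sequence concrete, here is a toy Python model of it. This is not NiFi code and not a NiFi API: the names `FlowFile`, `ingest`, `put_to_target`, `content_repository`, and `provenance_log`, as well as the file name and directory, are all made up for illustration, and a plain dict stands in for HDFS. It only sketches the order of events: content lands in a repository, a FlowFile pointing at it is created, and each step leaves a lineage record.

```python
import hashlib
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a FlowFile: attributes plus a pointer to the content."""
    attributes: dict
    content_claim: str  # key into the content repository, not the bytes themselves
    flowfile_id: str = field(default_factory=lambda: str(uuid.uuid4()))


content_repository = {}  # hypothetical: claim id -> raw bytes
provenance_log = []      # hypothetical: ordered lineage events


def record(event_type, flowfile, component):
    """Append one lineage event for the given FlowFile."""
    provenance_log.append({
        "event": event_type,
        "flowfile": flowfile.flowfile_id,
        "component": component,
        "time": time.time(),
    })


def ingest(filename, data, component="IngestFromSource"):
    """Store the content first, then create a FlowFile that references it."""
    claim = hashlib.sha256(data).hexdigest()
    content_repository[claim] = data
    flowfile = FlowFile(attributes={"filename": filename}, content_claim=claim)
    record("RECEIVE", flowfile, component)
    return flowfile


def put_to_target(flowfile, target, directory, component="PutToTarget"):
    """Write the FlowFile's content to the target (a dict standing in for HDFS)."""
    path = f"{directory}/{flowfile.attributes['filename']}"
    target[path] = content_repository[flowfile.content_claim]
    record("SEND", flowfile, component)
    return path


hdfs_stand_in = {}
ff = ingest("events.csv", b"id,value\n1,42\n")
written = put_to_target(ff, hdfs_stand_in, "/data/landing")
print(written)
print([e["event"] for e in provenance_log])
```

In a real flow you would build these steps from processors in the NiFi UI rather than write code; the sketch is only meant to show why ingestion always comes before the write to HDFS.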

Thank you,
Matt

Master Guru

@hema moger

HCC Tip: Try to avoid responding to an "answer" with a new answer. Instead, add a comment to the answer. This maintains an easy-to-follow, continuous thread when multiple users provide different answers.

If all you are doing is essentially copying data from point A to point B, there may be faster solutions, but they do not provide the control and lineage that NiFi provides on top of that process. Your response sounds like a continuous flow of data, with new data being transferred every hour. NiFi provides an intuitive user interface that makes building and modifying dataflows in real time very easy, so setting up and scheduling the dataflow you described would be straightforward. In addition, NiFi's UI makes it very easy to monitor the dataflow in action. I do not know the volume or the size of the data you are transferring.

Thank you,

Matt

*** If you find an answer that addresses your original question, please take a moment to log in and click the "accept" link for that answer.

Explorer
@Matt Clarke

I am getting the error below while transferring data from a remote machine to HDFS (which is on a different machine). Even when I give the file a different name, I still get the error that the file name already exists.

Also, when I check the path in HDFS, the result is not saved there.

Kindly help.

[Attached screenshot: 65031-data1.png]