Support Questions
Find answers, ask questions, and share your expertise

load big data remotely to HDFS



I was able to load data remotely through the WebHDFS REST API, but it doesn't work for a large volume of data. Is there any possibility to load huge data remotely? Is there a Hadoop API to do that?

Thank you


@Maher Hattabi

What is the error you are hitting? Can you paste it?

Also, what is the size of the data you are loading into Hadoop?


@Sagar Shimpi I am using the WebHDFS API through .NET and C# source code. When I load a small amount of data, the remote upload works well, but when I tried an 850 MB CSV file it didn't work. I have tried several file sizes and the problem persists. I am sure of my code because it works with a small CSV file. Is there another Hadoop API to access a remote Hadoop cluster? Thanks
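One common cause of large WebHDFS uploads failing is buffering the whole file in memory instead of streaming it through the two-step CREATE flow (NameNode returns a 307 redirect to a DataNode, and the body is sent there). A minimal stdlib Python sketch of that flow is below; host name, port, and paths are hypothetical, and the same two-step pattern applies from C#:

```python
import http.client
from urllib.parse import urlsplit

def webhdfs_create_path(hdfs_path):
    """Path + query for step 1 of the WebHDFS two-step CREATE PUT."""
    return f"/webhdfs/v1{hdfs_path}?op=CREATE&overwrite=true"

def upload_via_webhdfs(namenode_host, namenode_port, hdfs_path, local_file):
    # Step 1: the NameNode answers 307 with a Location header pointing
    # at a DataNode; we must resend the request body there ourselves.
    conn = http.client.HTTPConnection(namenode_host, namenode_port)
    conn.request("PUT", webhdfs_create_path(hdfs_path))
    resp = conn.getresponse()
    datanode_url = resp.getheader("Location")
    conn.close()

    # Step 2: pass the open file object as the body so http.client
    # streams it from disk instead of holding 850 MB in memory.
    parts = urlsplit(datanode_url)
    dn = http.client.HTTPConnection(parts.hostname, parts.port)
    with open(local_file, "rb") as f:
        dn.request("PUT", parts.path + "?" + parts.query, body=f)
    created = dn.getresponse()
    dn.close()
    return created.status  # WebHDFS returns 201 Created on success

# Hypothetical usage (cluster details are placeholders):
# upload_via_webhdfs("namenode.example.com", 50070,
#                    "/user/maher/data.csv", "data.csv")
```

Even with streaming, very large single-request uploads can still hit the timeout and scalability limits discussed below, which is why a native client is usually preferred at this scale.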

You always have scalability problems with the REST API, and possibly timeout issues as well. Have you considered installing an HDFS client on the machine where you have the data and using the native protocol?


@Hellmar Becker You mean creating a cluster and ensuring the data is duplicated so that it is reachable by the remote Hadoop cluster, and then using the normal load operation?

No, I mean have a single edge node that has only a Hadoop client installed (via Ambari or manually) and where the files to be uploaded are available. Then upload the files using the native HDFS protocol, which makes use of the distributed nature of Hadoop.
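On such an edge node, the upload is a single native-client call (`hdfs dfs -put`). A small Python sketch that shells out to the client is below; the local and HDFS paths are hypothetical, and it assumes the `hdfs` command is on PATH and configured (core-site.xml/hdfs-site.xml) to point at the target cluster:

```python
import subprocess

def hdfs_put_cmd(local_path, hdfs_dest):
    """Build the native-client upload command.

    -f overwrites an existing destination file, mirroring the
    overwrite=true flag in the WebHDFS CREATE call.
    """
    return ["hdfs", "dfs", "-put", "-f", local_path, hdfs_dest]

def hdfs_put(local_path, hdfs_dest):
    # The native client splits the file into blocks and writes them
    # to DataNodes directly, so large files are handled efficiently.
    subprocess.run(hdfs_put_cmd(local_path, hdfs_dest), check=True)

# Hypothetical usage on the edge node:
# hdfs_put("data.csv", "/user/maher/data.csv")
```

Because the client talks to the DataNodes directly, there is no single HTTP gateway to time out or bottleneck, which addresses the 850 MB failure seen with WebHDFS.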