I want to load data (~300 GB) from the local filesystem to HDFS, and I will be doing a similar load once every month.
What would be a feasible way to get this done? I am looking at the Flume and HDFS put options.
These are XML files, not log data. I don't need any conversion; it's a straight copy to HDFS.
My personal preference is HDFS put over Flume if those are the options. Even better would be HDF, but it sounds like a simple HDFS put would solve it.
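For a monthly job, the put can be wrapped in a small script and scheduled (cron, for example). The following is only a rough sketch, assuming the `hdfs` client is on the PATH of the machine holding the files. The function names and the paths in the usage note are hypothetical placeholders, not anything from the original setup.

```python
import subprocess
from typing import List


def build_put_commands(local_path: str, hdfs_dir: str,
                       overwrite: bool = False) -> List[List[str]]:
    """Build the `hdfs dfs` commands for a straight copy into HDFS.

    Hypothetical helper: returns the commands as argument lists so they
    can be inspected or logged before anything is executed.
    """
    # Create the target directory (and any parents) if it does not exist.
    mkdir_cmd = ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir]

    # Plain copy of the local file or directory into the target directory.
    put_cmd = ["hdfs", "dfs", "-put"]
    if overwrite:
        # -f overwrites files that already exist at the destination.
        put_cmd.append("-f")
    put_cmd += [local_path, hdfs_dir]

    return [mkdir_cmd, put_cmd]


def upload_to_hdfs(local_path: str, hdfs_dir: str,
                   overwrite: bool = False) -> None:
    """Run the commands in order; raises CalledProcessError on failure."""
    for cmd in build_put_commands(local_path, hdfs_dir, overwrite):
        subprocess.run(cmd, check=True)
```

Calling `upload_to_hdfs("/data/xml_export", "/landing/xml/2024-01")` (both paths made up for illustration) would create the HDFS directory and copy the local directory into it. Using a separate HDFS directory per month keeps each load isolated and makes a failed run easy to clean up and retry.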