Reading/Writing data from HDFS via Windows.
- Labels: HDFS
Created ‎11-29-2016 11:29 AM
Does anyone else have this requirement?
We have a set of automated processes that ingest data from all over the place (on Windows) and need a way to reliably load that data into HDFS.
What is the best way to do this? The NFS gateway (the lack of ACL support is a concern here)? HttpFS?
Cluster is kerberized and CentOS based, if that helps.
Any insights/tips would be very helpful.
Thanks.
Created ‎11-30-2016 12:56 AM
There are a few solutions out there for a Windows Hadoop client. I haven't tried any of them, so I will defer to the community at large for specifics. One elegant/interesting approach that I will point out is the possibility of a Flume agent for Windows.
Tried and true method:
Off the top of my head, you can use the file transfer method of your choice (SFTP, Samba, etc.) to get the files from the Windows machine to a Linux machine, and then use your favorite HDFS loading command/process (hdfs dfs -copyFromLocal, Flume, etc.) to get the files into HDFS.
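For what it's worth, here is a rough sketch of the second half of that approach: a small Python script running on the Linux staging machine that pushes whatever has landed in a staging directory into HDFS with `hdfs dfs -copyFromLocal`. The directory names, the `*.csv` pattern, and the function names are made up for illustration. It also assumes the `hdfs` client is on the PATH and, since the cluster is kerberized, that the account running it already holds a valid ticket (for example from `kinit` with a keytab). It defaults to a dry run that only returns the commands it would execute.

```python
import subprocess
from pathlib import Path, PurePosixPath


def build_load_command(local_file, hdfs_dir, overwrite=False):
    """Return the argv list for `hdfs dfs -copyFromLocal`."""
    cmd = ["hdfs", "dfs", "-copyFromLocal"]
    if overwrite:
        cmd.append("-f")  # -f overwrites an existing destination file
    cmd += [str(local_file), str(PurePosixPath(hdfs_dir))]
    return cmd


def load_staged_files(staging_dir, hdfs_dir, pattern="*.csv", dry_run=True):
    """Build (and optionally run) one load command per staged file.

    staging_dir, hdfs_dir, and pattern are hypothetical placeholders;
    with dry_run=True nothing is executed and the commands are only returned.
    """
    commands = []
    for path in sorted(Path(staging_dir).glob(pattern)):
        cmd = build_load_command(path, hdfs_dir)
        commands.append(cmd)
        if not dry_run:
            # Raises CalledProcessError if the hdfs client exits non-zero,
            # e.g. when there is no valid Kerberos ticket.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical paths: files arrive here from Windows via SFTP or Samba.
    for command in load_staged_files("/data/staging", "/landing/incoming"):
        print(" ".join(command))
```

Something like this could be run from cron after each transfer window. Moving or deleting files once they load successfully is left out here, so rerunning it as-is would attempt the same files again.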
