Created 09-29-2015 04:31 PM
An end user needs to get files from HDFS. The process is:
End user --> Gateway box (look for the file locally; if it is not there, talk to HDFS) --> HDFS --> copy the file to the gateway box
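The flow described above can be sketched in Python roughly as follows. This is only an illustration: `get_file`, `cache_dir`, and `open_remote` are hypothetical names, and `open_remote` stands in for whatever actually reads from HDFS (WebHDFS, Knox, a client library).

```python
import shutil
import tempfile
from pathlib import Path
from typing import BinaryIO, Callable


def get_file(hdfs_path: str, cache_dir: Path,
             open_remote: Callable[[str], BinaryIO]) -> Path:
    """Return a local copy of hdfs_path, contacting HDFS only on a cache miss.

    open_remote is a hypothetical callable that returns a readable binary
    stream for the given HDFS path. No validation of hdfs_path is done here
    (a real gateway should reject paths containing "..").
    """
    local = cache_dir / hdfs_path.lstrip("/")
    if local.is_file():
        return local  # already on the gateway box, no HDFS call needed

    local.parent.mkdir(parents=True, exist_ok=True)
    # Copy into a temporary file in the same directory, then rename it, so a
    # concurrent reader never sees a half-written file under the final name.
    with open_remote(hdfs_path) as src, tempfile.NamedTemporaryFile(
            dir=local.parent, delete=False) as tmp:
        shutil.copyfileobj(src, tmp)
    Path(tmp.name).replace(local)
    return local
```

A second request for the same path is served from the gateway's local directory. The sketch never expires or refreshes cached files, so it only fits data that does not change once written.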
Created 09-29-2015 04:39 PM
I highly recommend Knox's shell, which uses a DSL for those operations: http://knox.apache.org/books/knox-0-6-0/user-guide.html#WebHDFS
It is a great way to programmatically interact with a cluster in a controlled and audited manner (e.g. simpler DSL and secured gateway endpoint, no need to open every node's port). BTW, it's a Groovy DSL, which makes it trivial to run in any Java program.
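For anyone not using the Groovy shell, the same read can be done against the WebHDFS REST endpoint that Knox exposes. The following is a rough Python sketch, not the Knox DSL itself: the gateway URL, cluster (topology) name, and credentials are placeholders, HTTP Basic authentication is assumed, and the function names are made up for illustration.

```python
import base64
import urllib.parse
import urllib.request


def knox_webhdfs_url(gateway: str, cluster: str, hdfs_path: str) -> str:
    """Build the WebHDFS OPEN URL for an absolute HDFS path behind Knox.

    gateway is the gateway base URL, e.g. "https://host:8443/gateway";
    cluster is the Knox topology name. Both are placeholders here.
    """
    path = urllib.parse.quote(hdfs_path)  # hdfs_path must start with "/"
    return f"{gateway.rstrip('/')}/{cluster}/webhdfs/v1{path}?op=OPEN"


def open_via_knox(gateway: str, cluster: str, hdfs_path: str,
                  user: str, password: str):
    """Return a readable HTTP response streaming the file content.

    Assumes the gateway accepts HTTP Basic auth and presents a certificate
    that the local trust store accepts.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        knox_webhdfs_url(gateway, cluster, hdfs_path),
        headers={"Authorization": f"Basic {token}"},
    )
    # urlopen follows the redirect that WebHDFS issues for op=OPEN.
    return urllib.request.urlopen(req)
```

Only the gateway port has to be reachable from the client, which is the point made above about not opening every node's port.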
Created 09-29-2015 04:57 PM
Thanks @agrande@hortonworks.com
Created 09-29-2015 04:41 PM
An option here would be to put a standard caching web proxy in front of WebHDFS (or, better, in front of WebHDFS exposed through Knox) and have it ignore cache-control headers.