Member since
09-26-2015
135
Posts
85
Kudos Received
26
Solutions
About
Steve's a Hadoop committer, mostly working on cloud integration
06-06-2018
11:57 AM
Dominika: I need to add: S3 is not a real filesystem. You cannot safely use AWS S3 as a replacement for HDFS without a metadata consistency layer, and even then the eventual consistency of S3 updates and deletes causes problems. You can safely use it as a source of data. Using it as a direct destination of work takes care: consult the documentation specific to the version of Hadoop you are using before trying to make S3 the default filesystem. Special case: third-party object stores with full consistency. There, the fact that directory renames are not atomic may still cause problems with commit algorithms and the like, but the risk of corrupt data in the absence of failures is gone.
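To make the non-atomic rename point concrete, here is a minimal toy sketch in Python. It is not Hadoop or S3A code: `ToyObjectStore`, `rename_prefix` and the `fail_after` knob are hypothetical names invented for illustration. It assumes an object store has only flat keys, so a "directory rename" has to be emulated by copying and deleting each object in turn.

```python
class ToyObjectStore:
    """Toy model: a flat key -> bytes map; 'directories' are only key prefixes."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def list_prefix(self, prefix):
        # Returns a new sorted list, so callers may mutate the store while iterating.
        return sorted(k for k in self.objects if k.startswith(prefix))

    def rename_prefix(self, src, dst, fail_after=None):
        """Emulate a directory rename as a per-object copy followed by a delete.

        fail_after is a hypothetical knob: raise once that many objects have
        been moved, to mimic the process dying part-way through the rename.
        """
        for moved, key in enumerate(self.list_prefix(src)):
            if fail_after is not None and moved >= fail_after:
                raise RuntimeError("simulated failure mid-rename")
            self.objects[dst + key[len(src):]] = self.objects[key]  # copy
            del self.objects[key]                                   # delete


def demo():
    store = ToyObjectStore()
    for i in range(3):
        store.put(f"job/_temporary/part-{i}", b"data")
    try:
        # "Commit" the job by renaming its temporary directory; fail after one object.
        store.rename_prefix("job/_temporary/", "job/output/", fail_after=1)
    except RuntimeError:
        pass
    return store.list_prefix("job/output/"), store.list_prefix("job/_temporary/")
```

After the simulated failure, `demo()` leaves one part file under `job/output/` and two still under `job/_temporary/`. A reader of the destination sees partial output, which is the kind of state a commit algorithm built on atomic directory rename assumes can never be observed.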
12-14-2015
07:30 PM
Its actual title is "Hadoop and Kerberos: The Madness Beyond the Gate". There's an HP Lovecraft theme of "forbidden knowledge which will drive you insane", which is less a joke and more commentary. It's actually rendered on GitBook. If you are working with Kerberos, get a copy of the O'Reilly Hadoop Security book too. My little e-book was written to cover the bits that were left out: to extend rather than replace. Finally, being open source: contributions are welcome.
12-14-2015
07:26 PM
Thank you. View it as working notes to avoid me having to send emails to colleagues trying to understand things. And being working notes, it only covers the problems I've encountered. There are many more out there, and in fact I am having serious problems with Kerberos right now which have even me defeated. So don't expect it to solve all your problems.