I'm working on the design for a very large (50TB+) SharePoint document repository which will hold documents and files for many years. Using SharePoint as the front end for access and retrieval is a mandatory requirement, but I'm looking at options for where to store the historical data.
SharePoint can use Remote BLOB Storage (RBS), which is a way of storing the documents (Word, Excel, PDF, log files, etc.) somewhere other than the SQL Server database.
Does anyone know of a SharePoint RBS provider that works with Hadoop? Or does anyone have real experience of storing SharePoint data on Hadoop while still being able to use SharePoint to search and retrieve it?