SolrCloud Replication factor with index files in HDFS

Guru

Hortonworks has a tutorial that shows how to configure Solr to store index files in HDFS. Since HDFS is already a fault-tolerant file system, does this mean that with this approach we can keep a Solr replication factor of 1 for any collections (shards) that we create? It seems like a lot of redundancy to keep the default HDFS replication factor of 3 and add Solr replication on top of that.
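
To put a number on the redundancy being asked about, here is a rough sketch in Python. It assumes each Solr replica writes its own copy of the index into HDFS and that HDFS then replicates every block of that copy, so the two factors multiply. The function name is made up for illustration, and the figure covers stored copies only; it says nothing about how many Solr nodes can serve queries.

```python
def total_index_copies(solr_replication_factor: int, hdfs_replication: int = 3) -> int:
    """Estimate how many physical copies of one shard's index end up on disk.

    Assumption: every Solr replica stores its own index directory in HDFS,
    and HDFS keeps `hdfs_replication` copies of each block in it.
    """
    if solr_replication_factor < 1 or hdfs_replication < 1:
        raise ValueError("replication factors must be >= 1")
    return solr_replication_factor * hdfs_replication


# Compare a few Solr replication factors against the default HDFS factor of 3.
for solr_rf in (1, 2, 3):
    copies = total_index_copies(solr_rf)
    print(f"Solr RF={solr_rf}, HDFS RF=3 -> {copies} copies per shard")
```

Under that assumption, a Solr replication factor of 1 already leaves three copies of each shard on disk.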

Master Mentor
@Jeremy Dyer

This is a good read and may help you decide on the serving layer. If you are storing both the data and the index on HDFS, then I would go with a replication factor of 1.

https://community.hortonworks.com/questions/4858/solrcloud-performance-hdfs-indexdata.html#answer-48...
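
For reference, a collection with a replication factor of 1 can be requested through the Solr Collections API by passing replicationFactor=1 to the CREATE action. The snippet below is only a sketch that builds such a request URL; the host, port, collection name, and shard count are placeholders, and depending on the Solr version further parameters (for example the config set to use) may be required.

```python
from urllib.parse import urlencode


def create_collection_url(base_url: str, name: str, num_shards: int,
                          replication_factor: int = 1) -> str:
    """Build a Collections API CREATE request URL (helper name is illustrative)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder values: adjust the base URL and collection name for your cluster.
url = create_collection_url("http://localhost:8983/solr", "hdfs_collection", 2)
print(url)
```

The URL can then be sent with curl or any HTTP client; nothing here contacts a Solr server.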

Thanks for adding this; it is a good source. We covered a lot of replication and SolrCloud topics in there 🙂

An update regarding the HDFS replication configuration for Solr files: there is an open Jira for this, SOLR-6305 ("Ability to set the replication factor for index files created by HDFSDirectoryFactory").