<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: What are the best practices/guidelines for solr replication across data-centers? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150513#M32557</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/64/cnormile.html" nodeid="64"&gt;@cnormile&lt;/A&gt; If the DR site becomes primary after a disaster, what needs to be changed? Is there a document that covers this? And if replication is scheduled for off-peak hours and the primary fails during peak hours, will the data written since the last replication be lost?&lt;/P&gt;</description>
    <pubDate>Tue, 21 Jun 2016 22:39:46 GMT</pubDate>
    <dc:creator>rbiswas1</dc:creator>
    <dc:date>2016-06-21T22:39:46Z</dc:date>
    <item>
      <title>What are the best practices/guidelines for solr replication across data-centers?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150511#M32555</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;What are the best practices/guidelines for Solr replication across data centers (primary and DR)? I did find the Cross Data Center Replication feature in Solr 6.0. Has anyone used it successfully in a production environment?&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;Thanks for looking.
&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Raj&lt;/DIV&gt;</description>
      <pubDate>Tue, 21 Jun 2016 22:25:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150511#M32555</guid>
      <dc:creator>rbiswas1</dc:creator>
      <dc:date>2016-06-21T22:25:47Z</dc:date>
    </item>
    <item>
      <title>Re: What are the best practices/guidelines for solr replication across data-centers?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150512#M32556</link>
      <description>&lt;P&gt;Falcon can be used to replicate the Solr transaction logs and index. If the index is being updated while it is copied, replication may fail and will be retried automatically. Therefore, it's best to schedule replication for off-peak periods.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jun 2016 22:34:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150512#M32556</guid>
      <dc:creator>cnormile</dc:creator>
      <dc:date>2016-06-21T22:34:34Z</dc:date>
    </item>
    <item>
      <title>Re: What are the best practices/guidelines for solr replication across data-centers?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150513#M32557</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/64/cnormile.html" nodeid="64"&gt;@cnormile&lt;/A&gt; If the DR site becomes primary after a disaster, what needs to be changed? Is there a document that covers this? And if replication is scheduled for off-peak hours and the primary fails during peak hours, will the data written since the last replication be lost?&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jun 2016 22:39:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150513#M32557</guid>
      <dc:creator>rbiswas1</dc:creator>
      <dc:date>2016-06-21T22:39:46Z</dc:date>
    </item>
    <item>
      <title>Re: What are the best practices/guidelines for solr replication across data-centers?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150514#M32558</link>
      <description>&lt;P&gt;For high availability with Solr, the best practice is probably to use SolrCloud. I believe that with SolrCloud you let Solr handle the replication by creating additional replicas of each shard. The Solr docs have more info (http://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-5.2.pdf).&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jun 2016 23:12:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150514#M32558</guid>
      <dc:creator>cnormile</dc:creator>
      <dc:date>2016-06-21T23:12:18Z</dc:date>
    </item>
    <item>
      <title>Re: What are the best practices/guidelines for solr replication across data-centers?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150515#M32559</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/64/cnormile.html" nodeid="64"&gt;@cnormile&lt;/A&gt; Which version of Falcon supports this?&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jun 2016 23:52:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150515#M32559</guid>
      <dc:creator>rbiswas1</dc:creator>
      <dc:date>2016-06-21T23:52:03Z</dc:date>
    </item>
    <item>
      <title>Re: What are the best practices/guidelines for solr replication across data-centers?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150516#M32560</link>
      <description>&lt;P&gt;Note that SolrCloud's built-in replication is not intended to span data centers, because of the volume of traffic and its dependency on ZooKeeper ensembles. However, the recently released Solr 6.x added a dedicated replication mechanism for going across data centers.&lt;/P&gt;&lt;P&gt;See &lt;A href="https://issues.apache.org/jira/browse/SOLR-6273"&gt;https://issues.apache.org/jira/browse/SOLR-6273&lt;/A&gt;, which is based on this description: &lt;A href="http://yonik.com/solr-cross-data-center-replication/"&gt;http://yonik.com/solr-cross-data-center-replication/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Basically, this is cross-cluster replication, which is different from SolrCloud's standard replication mechanism.&lt;/P&gt;</description>
      <pubDate>Wed, 22 Jun 2016 00:25:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150516#M32560</guid>
      <dc:creator>james_jones</dc:creator>
      <dc:date>2016-06-22T00:25:15Z</dc:date>
    </item>
    <item>
      <title>Re: What are the best practices/guidelines for solr replication across data-centers?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150517#M32561</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3902/rbiswas.html" nodeid="3902"&gt;@rbiswas&lt;/A&gt;, you may have read this already, but there's some good info here in what they describe as a "real world" production configuration using the new cross-data-center replication: &lt;A href="https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62687462"&gt;https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62687462&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Since this feature only came out in 6.0, which was released less than two months ago, there has probably been limited production use.&lt;/P&gt;&lt;P&gt;Also, not a best practice, but since way before SolrCloud existed, we used a brute-force method of cross-data-center replication for standby Solr instances with the magic of rsync. You can reliably use rsync to copy indexes as they are being updated, but there's a bit of scripting required.&lt;/P&gt;&lt;P&gt;I have only done this in non-cloud environments, but I'm pretty sure it can be done in cloud as well. It is crude, but it worked for years and uses some of the great features of Linux.&lt;/P&gt;&lt;P&gt;Example script, run from crontab on the DR site nodes:&lt;/P&gt;&lt;PRE&gt;#step 1 - Create a backup first, assuming your current copy is good.
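# data_dir and primary_site_node must be set before this point.
rm -rf "${data_dir}.BAK"   # drop the previous backup so cp does not nest inside it
cp -rl "${data_dir}" "${data_dir}.BAK"

#step 2 - Now copy from the primary site
# Trailing slashes make rsync sync the directory contents instead of
# creating a nested copy. Any nonzero exit (for example 24, source files
# vanished mid-transfer) triggers another pass.
status=1
while [ "$status" -ne 0 ]; do
    rsync -a --delete "${primary_site_node}:${data_dir}/" "${data_dir}/"
    status=$?
done
echo "COPY COMPLETE!"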

&lt;/PRE&gt;&lt;P&gt;The script first creates a local backup (near-instant, because it uses hard links, not soft links), then copies only new or changed files from the primary and deletes files on the DR node that have been deleted on the primary. If files disappear during the rsync copy, it runs rsync again until a pass completes with nothing changing underneath it. This can be run from crontab, but it does need a bit of bullet-proofing.&lt;/P&gt;&lt;P&gt;Simple. Crude. It works.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jun 2016 10:20:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150517#M32561</guid>
      <dc:creator>james_jones</dc:creator>
      <dc:date>2016-06-23T10:20:22Z</dc:date>
    </item>
    <item>
      <title>Re: What are the best practices/guidelines for solr replication across data-centers?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150518#M32562</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2229/jamesjones.html" nodeid="2229"&gt;@james.jones&lt;/A&gt; If you can post this as an answer, I will accept it. Thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jun 2016 21:59:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-guidelines-for-solr-replication/m-p/150518#M32562</guid>
      <dc:creator>rbiswas1</dc:creator>
      <dc:date>2016-06-23T21:59:13Z</dc:date>
    </item>
  </channel>
</rss>

