Solr disaster recovery at CDH 5.4.8
Created on ‎03-15-2017 04:51 AM - edited ‎09-16-2022 04:15 AM
We have a CDH 5.4.8 production (PRO) cluster and have set up a CDH 5.4.8 DR cluster for disaster recovery. We now want the Solr instances on both clusters to stay in sync on indexes and collections, so that the DR cluster can provide the same service as the PRO cluster during downtime.
We would like answers to the following questions:
1. Will simply copying the PRO cluster's index and collection folders in HDFS to the DR cluster work?
2. Is there any way to keep the Solr indexes and collections on the CDH 5.4.8 PRO and DR clusters always in sync?
3. What is the recommended way to back up the PRO Solr indexes and collections to the DR cluster?
Created ‎03-15-2017 03:58 PM
1. Will simply copying the PRO cluster's index and collection folders in HDFS to the DR cluster work?
Unfortunately, this will not work. The Solr index and tlog files are constantly being updated, and there is no way to ensure a consistent snapshot while Solr is running. It could be done if Solr were shut down. However, the core_node directories under /solr/<collection_name> in HDFS are mapped to specific shards/replicas, so when you create the corresponding collection in DR you would have to ensure that the core_node directories map to the same shards/replicas at collection creation time.
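To make the shard/replica mapping concrete, here is a minimal Python sketch that compares which shard each core_node belongs to on the two clusters. It assumes you have already exported the cluster state from ZooKeeper on each side as a JSON string in the Solr 4.x clusterstate.json layout (collection, then "shards", then "replicas" keyed by core_node name). The function names are hypothetical and only for illustration.

```python
import json


def core_node_shards(clusterstate_json, collection):
    """Map each core_node name to the shard it serves, per clusterstate.json."""
    state = json.loads(clusterstate_json)
    mapping = {}
    for shard_name, shard in state[collection]["shards"].items():
        for core_node in shard.get("replicas", {}):
            mapping[core_node] = shard_name
    return mapping


def mapping_mismatches(pro_json, dr_json, collection):
    """Return core_nodes whose shard differs between PRO and DR (or is missing).

    Each value is a (pro_shard, dr_shard) tuple; None means the core_node
    does not exist on that side.
    """
    pro = core_node_shards(pro_json, collection)
    dr = core_node_shards(dr_json, collection)
    return {
        node: (pro.get(node), dr.get(node))
        for node in sorted(set(pro) | set(dr))
        if pro.get(node) != dr.get(node)
    }
```

An empty result means every core_node directory would line up with the same shard on both clusters. Any entry in the result is a directory that would be attached to the wrong shard if the index files were copied over as-is.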
2. Is there any way to keep the Solr indexes and collections on the CDH 5.4.8 PRO and DR clusters always in sync?
Prior to CDH 5.9, the best way to do this is to have your indexing jobs publish documents to both collections. As of CDH 5.9, collections can be backed up and restored, either locally or in DR: https://www.cloudera.com/documentation/enterprise/5-9-x/topics/search_backup_restore.html
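As a rough illustration of the "publish to both collections" approach, the Python sketch below sends the same batch of documents to the same collection on each cluster through Solr's JSON update handler. The host names, collection name, and function names are hypothetical. It also assumes the collections commit on their own (for example via autoCommit), since no commit parameter is sent. A real indexing job would need retry or replay logic for the cluster that failed.

```python
import json
import urllib.request


def build_update_request(solr_base_url, collection, docs):
    """Build a JSON update request for one Solr collection."""
    url = "{}/{}/update".format(solr_base_url.rstrip("/"), collection)
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def http_send(request):
    """Default sender: perform the HTTP call and discard the response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        response.read()


def publish_to_clusters(docs, solr_base_urls, collection, send=http_send):
    """Send the same batch to every cluster; return (url, error) for failures."""
    failed = []
    for base_url in solr_base_urls:
        request = build_update_request(base_url, collection, docs)
        try:
            send(request)
        except OSError as exc:  # urllib's URLError and HTTPError derive from OSError
            failed.append((request.full_url, str(exc)))
    return failed
```

Called as `publish_to_clusters(batch, ["http://pro-solr:8983/solr", "http://dr-solr:8983/solr"], "my_collection")`, it returns an empty list when both clusters accepted the batch. Anything in the returned list has to be re-sent later, otherwise the two collections drift apart.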
3. What is the recommended way to back up the PRO Solr indexes and collections to the DR cluster?
If you can't upgrade to CDH 5.9, the recommended way to back up the Solr indexes is to stop the Solr service and take an HDFS snapshot or use distcp to copy the indexes to a backup location. If you need to run the same collection at the backup location, you would need to create it with the createNodeSet property (Solr 4.10.3) to ensure the collection gets created on the proper nodes, and you'd have to verify that the core_nodeN directories map to the same shards in clusterstate.json as in production.
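The cold-backup steps above could be scripted roughly as follows. This Python sketch only assembles the commands and the Collections API URL. It assumes the Solr service has already been stopped and that the index directory has been made snapshottable by an HDFS administrator. Taking a snapshot first and then running distcp from the snapshot path is one possible combination of the two options, not the only one. All paths, host names, and function names are hypothetical, and the CREATE call may need further parameters (such as collection.configName) in your environment.

```python
import subprocess
from urllib.parse import urlencode


def backup_commands(index_dir, snapshot_name, backup_uri):
    """Return the CLI steps to run once the Solr service is stopped."""
    snapshot_path = "{}/.snapshot/{}".format(index_dir.rstrip("/"), snapshot_name)
    return [
        # Take a point-in-time snapshot of the collection's index directory.
        ["hdfs", "dfs", "-createSnapshot", index_dir, snapshot_name],
        # Copy the read-only snapshot contents to the backup location.
        ["hadoop", "distcp", snapshot_path, backup_uri],
    ]


def create_collection_url(solr_base_url, name, num_shards, replication_factor, nodes):
    """Build a Collections API CREATE URL pinned to specific nodes."""
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        # Node names use Solr's host:port_context form, e.g. "host:8983_solr".
        "createNodeSet": ",".join(nodes),
    })
    return "{}/admin/collections?{}".format(solr_base_url.rstrip("/"), params)


def run_all(commands, runner=subprocess.check_call):
    """Run each command in order; check_call stops at the first failure."""
    for command in commands:
        runner(command)
```

Creating the collection with createNodeSet controls where the cores land, but it does not by itself guarantee that core_nodeN names match production, so the clusterstate.json check is still needed before pointing the DR cores at the copied index directories.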
-pd
Created ‎08-01-2017 12:35 AM
The following blog post helped me set up disaster recovery for Solr:
https://blog.cloudera.com/blog/2017/05/how-to-backup-and-disaster-recovery-for-apache-solr-part-i/
