Solr version: 5.5.2
My project requires transferring SolrCloud indexes from a Cloudera cluster to an HDP cluster.
We have tried the Solr backup and restore APIs, but the data is not visible in SolrCloud. Please check whether we are missing a step in the procedure below:
1) Allowed snapshot (Cloudera cluster) :
sudo -u hdfs hadoop dfsadmin -allowSnapshot /user/solr/CollectionName
2) Created snapshot :
sudo -u hdfs hadoop dfs -createSnapshot /user/solr/CollectionName/
3) Created a Solr collection on the HDP cluster with the same name and the same number of shards and replicas.
4) Used “distcp” to transfer snapshot :
sudo -u solr hadoop distcp hdfs://NameNodeCDH-IP:8020/user/solr/CDHCollectionName/.snapshot/s20180601-131020.000 hdfs://NameNodeHDP-IP:8020/user/solr
5) Restore snapshot on collection level :
sudo -u solr hadoop fs -cp /user/solr/s20180601-131020.000/* /user/solr/HDPCollectionName/
Restored the snapshot from /user/solr to the collection directory for each shard and replica.
OUTCOME: The HDFS directory was restored, but the data is not visible in the Solr UI; 0 records are displayed. We checked the HDFS directory using:
sudo hadoop fs -du -s -h /user/solr/HDPCollectionName/
The SolrCloud UI still shows 0 indexed documents.
Are we missing something here?
Try following this guide and let me know if it works for you: https://blog.cloudera.com/blog/2017/05/how-to-backup-and-disaster-recovery-for-apache-solr-part-i/
You probably can't see anything in the SolrCloud UI because you didn't restore the clusterstate.json and the core.properties for each collection (stored locally on each Solr server under /var/lib/solr/) from the "old" infrastructure.
With the procedure you described above, you copied only the index data of the collections; the metadata is missing.
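To see what Solr itself knows about the collection, as opposed to what is sitting in HDFS, something like the following might help. This is only a sketch: the two helper functions are hypothetical names of mine, and the host, port 8983, and the /solr context path are assumptions to adjust for your cluster. CLUSTERSTATUS is the Collections API action that reports shards and replicas, and a query with rows=0 returns only the document count.

```shell
# Sketch only: helper names, host, and port are placeholders, not from the post above.

# Print the Collections API URL that reports cluster state for one collection.
# $1 = host:port of a Solr node, $2 = collection name
clusterstatus_url() {
  printf 'http://%s/solr/admin/collections?action=CLUSTERSTATUS&collection=%s&wt=json\n' "$1" "$2"
}

# Print a select URL that returns only the document count (rows=0).
# $1 = host:port of a Solr node, $2 = collection name
count_url() {
  printf 'http://%s/solr/%s/select?q=*:*&rows=0&wt=json\n' "$1" "$2"
}

# Example usage (uncomment and replace hdp-solr-host with a real node):
# curl -s "$(clusterstatus_url hdp-solr-host:8983 HDPCollectionName)"
# curl -s "$(count_url hdp-solr-host:8983 HDPCollectionName)"
# List the core.properties files on this Solr server (path taken from the answer above):
# find /var/lib/solr -name core.properties -exec cat {} \;
```

If CLUSTERSTATUS lists the shards and replicas you expect but numFound is still 0, the cores are probably pointing at different index directories than the ones you copied into, so comparing the core.properties on both clusters would be the next thing to check.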