New Contributor
Posts: 1
Registered: 06-04-2018

Cloudera to HDP Solr (version 5.5.2) data migration | Solr indexes not visible after restore


SOLR version - 5.5.2


My project requires transferring SolrCloud indexes from a Cloudera cluster to an HDP cluster.

  • The data is huge (1 billion indexed records in production), so re-indexing is not an option.

We have tried the Solr backup and restore APIs, but the data is not visible in SolrCloud. Please check whether we are missing a step in the procedure below:

1) Allowed snapshot (Cloudera cluster) :
sudo -u hdfs hadoop dfsadmin -allowSnapshot /user/solr/CollectionName

2) Created snapshot :
sudo -u hdfs hadoop dfs -createSnapshot /user/solr/CollectionName/

3) Created Solr collection on HDP cluster with the same name and the same number of shards and replicas.

4) Used “distcp” to transfer snapshot :
sudo -u solr hadoop distcp hdfs://NameNodeCDH-IP:8020/user/solr/CDHCollectionName/.snapshot/s20180601-131020.000 hdfs://NameNodeHDP-IP:8020/user/solr

5) Restored snapshot at collection level :
sudo -u solr hadoop fs -cp /user/solr/s20180601-131020.000/* /user/solr/HDPCollectionName/
We restored the snapshot from /user/solr into the collection directory for each shard and replica.
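For reference, steps 1, 2, 4 and 5 above can be collected into one script. This is only a dry-run sketch: it prints the commands instead of running them unless DRY_RUN=0 is set. The run helper, the migrate function and the DRY_RUN switch are hypothetical names introduced here, and the sketch assumes that CollectionName in steps 1 and 2 is the same directory as CDHCollectionName in step 4. Step 3 (creating the collection on HDP) is not included.

```shell
#!/usr/bin/env bash
# Dry-run sketch of steps 1, 2, 4 and 5. All hosts, paths and names are the
# placeholders from the post, not real values.
set -eu

SRC_NN="hdfs://NameNodeCDH-IP:8020"
DST_NN="hdfs://NameNodeHDP-IP:8020"
SRC_COLLECTION="CDHCollectionName"   # assumed to be the directory used in steps 1 and 2
DST_COLLECTION="HDPCollectionName"
# Snapshot name as it appears in the post; createSnapshot was called without
# an explicit name, so the actual name has to be read from its output.
SNAPSHOT="s20180601-131020.000"
DRY_RUN="${DRY_RUN:-1}"              # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

migrate() {
  # Step 1: allow snapshots on the source collection directory.
  run sudo -u hdfs hadoop dfsadmin -allowSnapshot "/user/solr/${SRC_COLLECTION}"
  # Step 2: create the snapshot.
  run sudo -u hdfs hadoop dfs -createSnapshot "/user/solr/${SRC_COLLECTION}/"
  # Step 4: copy the snapshot to the HDP cluster.
  run sudo -u solr hadoop distcp \
    "${SRC_NN}/user/solr/${SRC_COLLECTION}/.snapshot/${SNAPSHOT}" \
    "${DST_NN}/user/solr"
  # Step 5: copy the snapshot contents into the collection directory.
  # The glob stays quoted so that HDFS, not the local shell, expands it.
  run sudo -u solr hadoop fs -cp "/user/solr/${SNAPSHOT}/*" "/user/solr/${DST_COLLECTION}/"
}

migrate
```

With the default DRY_RUN=1 this only echoes the four commands, which makes it easy to compare the exact paths used on each cluster.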


OUTCOME : The HDFS directory was restored, but the data is not visible in the Solr UI; 0 records are displayed. We checked the HDFS directory using:
sudo hadoop fs -du -s -h /user/solr/HDPCollectionName/



Solr Cloud UI: 0 indexes
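The count can also be checked outside the UI with a match-all query against the collection's select handler. The following is a minimal sketch: the host solr-host and port 8983 are assumptions (not values from our cluster), and count_url and num_found are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: read the document count for a collection from Solr's select handler.
set -eu

SOLR_URL="${SOLR_URL:-http://solr-host:8983/solr}"   # assumed host and port
COLLECTION="${COLLECTION:-HDPCollectionName}"

# Build a match-all query that returns no rows, only the response header and count.
count_url() {
  echo "${SOLR_URL}/${COLLECTION}/select?q=*:*&rows=0&wt=json"
}

# Extract the numFound value from a Solr JSON response read on stdin.
num_found() {
  sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Against a live cluster (not executed here):
#   curl -s "$(count_url)" | num_found
```

If this prints 0 while the HDFS directory size is non-zero, the files are present on HDFS but not in the index that the collection is serving.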





Are we missing something here?