Is there a way to migrate Kudu tables, including their data, from one cluster to another?
there is a solution I'm going to test mentioned in
The main idea is to create a backup with Spark, move it to the other cluster with distcp, and then restore the backup there.
I have tested the backup/restore solution and it seems to work like a charm with Spark:
- First, check and record the names exactly as they are listed by the Kudu master (or by the elected leader master in a multi-master setup).
- Download kudu-backupX.X.jar if you can't find it in /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/, and put it there.
- In --kuduMasterAddresses, put the name of your Kudu master, or the names of your three masters separated by ','.
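Backup (run on the source cluster; the tables to back up are listed at the end of the command):
sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduBackup /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar --kuduMasterAddresses MASTER1(,MASTER2,..) --rootPath hdfs:///PATH_HDFS impala::DB.TABLE
Copy the backup to the destination cluster:
sudo -u hdfs hadoop distcp -i hdfs:///PATH_HDFS/DB.TABLE hdfs://XXX:8020/kudu_backups/
Restore (run on the destination cluster; --rootPath here is the backup root on that cluster, i.e. the directory the backup was copied into):
sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduRestore /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar --kuduMasterAddresses MASTER1(,MASTER2,..) --rootPath hdfs:///PATH_HDFS impala::DB.TABLE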
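
If you want to check the three commands before running anything, here is a small dry-run sketch that only prints them. It is an illustration, not a tested migration tool. Every value in it (the jar path, the master names, dst-namenode and both root paths) is a hypothetical placeholder. I'm also assuming that you copy the whole backup root with distcp -update rather than a single table directory, and that the restore reads from the copied root on the destination cluster.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the three migration commands instead of executing them.
set -euo pipefail

# Hypothetical placeholders (assumptions); override them via environment variables.
JAR="${JAR:-/path/to/kudu-backup2_2.11-1.13.0.jar}"
SRC_MASTERS="${SRC_MASTERS:-src-master1,src-master2,src-master3}"
DST_MASTERS="${DST_MASTERS:-dst-master1}"
SRC_ROOT="${SRC_ROOT:-hdfs:///kudu_backups}"
DST_ROOT="${DST_ROOT:-hdfs://dst-namenode:8020/kudu_backups}"

# Step 1 (source cluster): back up the tables given as arguments.
backup_cmd() {
  echo "spark2-submit --class org.apache.kudu.backup.KuduBackup $JAR" \
       "--kuduMasterAddresses $SRC_MASTERS --rootPath $SRC_ROOT $*"
}

# Step 2: sync the contents of the backup root to the destination cluster.
copy_cmd() {
  echo "hadoop distcp -update $SRC_ROOT $DST_ROOT"
}

# Step 3 (destination cluster): restore the same tables from the copied root.
restore_cmd() {
  echo "spark2-submit --class org.apache.kudu.backup.KuduRestore $JAR" \
       "--kuduMasterAddresses $DST_MASTERS --rootPath $DST_ROOT $*"
}

# Fall back to a placeholder table name when none is given on the command line.
if [ "$#" -eq 0 ]; then
  set -- "impala::DB.TABLE"
fi

backup_cmd "$@"
copy_cmd
restore_cmd "$@"
```

Pass the table names as arguments, using the names exactly as Kudu lists them (for example impala::DB.TABLE). Review the printed lines, then run them yourself with sudo -u hdfs as in the commands above.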