Issues moving a Kudu installation along with a CDH cluster to new machines while keeping the same data disks

Contributor

Hi Team,

 

Greetings!

 

I needed to move the Kudu installation from my current cluster of AWS machines to a new cluster of AWS machines.

So I performed the following steps:

  1. Took a backup/snapshot of the disks containing the Kudu data directory.
  2. Installed CDH and Kudu on the new AWS cluster and attached the disks from the older cluster to the newer one.
  3. After the installation I checked Kudu's health in Cloudera Manager; the Kudu cluster was healthy and all the new tablet servers were running fine along with the master.
  4. When I ran the ksck command (shown just below this list), all except a few tables were in CONSENSUS_MISMATCH state.
  5. The logs suggest that the master is still trying to contact tablet servers at the IP addresses of the older machines.
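
For reference, the check in step 4 was run with the kudu CLI roughly as follows (the master hostname and port below are placeholders for my new cluster's master):

  # Health check against the new cluster's Kudu master (placeholder hostname)
  kudu cluster ksck new-master-1.example.com:7051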

Can anybody help point out why this is happening?

 

 

Regards

Parth

 


3 REPLIES

Contributor

Kudu doesn't support swapping a drive to a new host.

 

Kudu tablet servers store the consensus configuration of their tablets, and the master also stores the consensus configuration for the tablets. By moving all the servers, you changed all the hostnames, and now the cluster is in total disarray. It's possible to rewrite the consensus configuration of the tablets on the tablet servers, but I'm not sure there's currently a way to rewrite the data in the master. So, by scripting `kudu local_replica cmeta rewrite_raft_config` you could fix the tablet servers. You will need to rewrite the config of each tablet so the hostnames are mapped from the old servers to the new servers. If you do that correctly and the tablet replicas are able to elect a leader, the leader will send updated information to the master, which should cause it to update its record. I don't think many people have ever tried anything like this, so there may be other things that need to be fixed, or it simply might not be possible to recover the cluster.
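
As a rough sketch only, with made-up directories, tablet ID, peer UUIDs, and hostnames, the per-tablet rewrite on a stopped tablet server would look something like this:

  # List the local replicas to iterate over (directories are placeholders):
  #   kudu local_replica list --fs_wal_dir=/data/kudu/tserver/wal --fs_data_dirs=/data/kudu/tserver/data
  # Then, for each tablet replica, rewrite its Raft config so the peers point
  # at the new hosts (UUIDs, tablet ID, hostnames, and dirs are placeholders):
  sudo -u kudu kudu local_replica cmeta rewrite_raft_config \
    --fs_wal_dir=/data/kudu/tserver/wal \
    --fs_data_dirs=/data/kudu/tserver/data \
    <tablet_id> \
    <tserver1_uuid>:new-tserver-1.example.com:7050 \
    <tserver2_uuid>:new-tserver-2.example.com:7050 \
    <tserver3_uuid>:new-tserver-3.example.com:7050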

 

What you should have done is either set up the new cluster and then transferred the data via Spark or an Impala CTAS statement, or built the new cluster as an expansion of the existing one, decommissioned all the tablet servers of the old cluster, and then moved the master nodes one by one to the new cluster.
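
To make the first route a bit more concrete, here is a rough, hypothetical sketch (the hostnames, table names, primary key, and partitioning below are all invented): Impala on the new cluster is pointed at the table still served by the old Kudu masters through an external table, and the rows are then copied with a CTAS into a Kudu table on the new cluster.

  # Map the old cluster's Kudu table into Impala on the new cluster
  # (master address and table names are placeholders).
  impala-shell -i new-impala-host -q "
    CREATE EXTERNAL TABLE old_my_table
    STORED AS KUDU
    TBLPROPERTIES ('kudu.master_addresses' = 'old-master-1.example.com:7051',
                   'kudu.table_name' = 'impala::default.my_table')"

  # Copy the rows into a new Kudu table on the new cluster
  # (schema, key, and partitioning are placeholders).
  impala-shell -i new-impala-host -q "
    CREATE TABLE my_table
    PRIMARY KEY (id)
    PARTITION BY HASH (id) PARTITIONS 8
    STORED AS KUDU
    AS SELECT * FROM old_my_table"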

avatar
Contributor

Thanks wdberkeley 

I will try out the options you suggested.

Can you suggest best practices or some options for the following?

  • My Kudu cluster, managed through Cloudera Manager and CDH, is currently hosted in AWS.
  • I want to move it from AWS to Azure or some other cloud service provider.
  • I will try the options you suggested for the above scenario, but is there anything else you would suggest?

 

Regards

Parth

Contributor
For now, I think the two possible methods I outlined might work. Additionally, you could export the data to something like Parquet or Avro, using Spark or Impala, and then reload the data in the new cluster.
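
For the cross-cloud case, a very rough sketch of that export-and-reload idea (every hostname, path, and table name below is a placeholder, and it assumes the Azure storage connector is configured so DistCp can write to wasbs://):

  # 1. On the old cluster: stage the Kudu table as Parquet.
  impala-shell -i old-impala-host -q \
    "CREATE TABLE staging_my_table STORED AS PARQUET AS SELECT * FROM my_table"

  # 2. Copy the Parquet files to storage reachable from the new cluster.
  hadoop distcp \
    hdfs:///user/hive/warehouse/staging_my_table \
    wasbs://container@account.blob.core.windows.net/staging/my_table

  # 3. On the new cluster: after creating a Parquet table over the copied
  #    files and the target Kudu table, reload the rows.
  impala-shell -i new-impala-host -q \
    "INSERT INTO my_table SELECT * FROM staging_my_table"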