replacement of ec2 machine

Explorer

We have successfully run a Cloudera cluster using Director. Great tool indeed.

We want to prepare steps for replacing an unhealthy EC2 machine, or simply upgrading an EC2 machine to a more powerful one.

 

What would be the standard steps to do this in the cases below, without fear of corrupting the cluster state:

 

  •  replacing Cloudera manager ec2 machine
  •  replacing any master/worker ec2 machine.

 

Thanks !!

Regards,

Kartik


Re: replacement of ec2 machine

Cloudera Employee

Hello,

 

To replace an EC2 instance with a more powerful instance, do one of the following:

 

1. Power down the EC2 instance.

2. In the EC2 management console, right click the instance and choose "Change Instance Type."

 

OR

 

1. Create an image (AMI) of the EC2 instance.

2. Launch an instance from the AMI.

 

OR

 

1. Create a snapshot of the EC2 instance's volume.

2. Create a volume from the snapshot.

3. Provision a larger EC2 instance.

4. Power down the EC2 instance.

5. Detach the existing volume from the EC2 instance.

6. Attach the volume, provisioned from the snapshot, to the EC2 instance.
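
The first option above (stop, change type, start) could be scripted roughly as follows. This is only a sketch: it assumes `ec2` is a boto3 EC2 client, that the instance is EBS-backed, and that stopping it is acceptable. The instance ID and target type in the usage comment are placeholders.

```python
def resize_instance(ec2, instance_id, new_type):
    """Stop an EBS-backed instance, change its type, and start it again.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # The instance must be fully stopped before its type can be changed.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Equivalent of "Change Instance Type" in the console.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Usage (placeholder values, requires boto3 and AWS credentials):
#   import boto3
#   resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "m4.2xlarge")
```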

Re: replacement of ec2 machine

Explorer

Thank you. Those are the standard options, but in a Cloudera context with instance storage (i2/d2 EC2 instances), most of them will not work. First, these instances use instance storage, not EBS. Second, we can't stop any machine in a production cluster to resize or replace it.

 

Only the option below looks good to me.

 

1. Create an image (AMI) of the EC2 instance.

2. Launch an instance from the AMI.

 

I would like to know: if we build a new AMI from a master/worker node, launch a machine from it, and attach that machine to Cloudera Manager, will it be able to work as a replacement for the original node's services?

 

Cloudera Manager is an m4.xlarge machine backed by an EBS volume. I believe we will be able to restore it following the standard EBS guidelines.

Re: replacement of ec2 machine

Cloudera Employee

For the worker nodes (e.g., DataNodes), it would be better to add a larger EC2 instance to the cluster, replicate the data to the instance, and add (or transfer) any roles. Once the smaller EC2 instance is no longer needed, simply decommission and remove it.

 

Re: replacement of ec2 machine

Expert Contributor

Hello there,

 

There are three distinct replacement scenarios being contemplated here.

 

The easiest is replacing a worker, such as a node that only hosts, say, a node manager and datanode. Rufus's guidance is correct: you should add a new worker built from the larger instance type, with the AMI of your choice, and allow HDFS replication to work. Then it becomes safe to remove older, smaller workers. Note that it is best to do this through Director, using its grow/shrink capability, so that it remains in sync with Cloudera Manager and the true cluster state. You will need to define a new instance template for the larger worker, and define a new instance group with the roles matching the older workers. Documentation on the process is here:

 

https://www.cloudera.com/documentation/director/latest/topics/director_ui_cluster_shrink.html
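
Before shrinking, it can help to sanity-check that the remaining workers can absorb the blocks held by the one being removed. The sketch below is not part of Director or Cloudera Manager. It is plain arithmetic over per-node capacity and usage figures (for example, as reported by `hdfs dfsadmin -report`), and the replication factor of 3 and the 80% headroom threshold are assumptions to adjust for your cluster.

```python
def can_remove_worker(workers, victim, replication=3, headroom=0.8):
    """Return True if `victim` can be removed without overfilling HDFS.

    workers: dict of hostname -> (capacity_bytes, dfs_used_bytes),
             where dfs_used_bytes is raw usage (replicas included).
    """
    remaining = {h: v for h, v in workers.items() if h != victim}

    # Each block needs `replication` distinct DataNodes to live on.
    if len(remaining) < replication:
        return False

    # All raw data, including the victim's, must fit on the remaining nodes.
    total_used = sum(used for _, used in workers.values())
    remaining_capacity = sum(cap for cap, _ in remaining.values())
    return total_used <= headroom * remaining_capacity


# Hypothetical 4-node cluster, sizes in arbitrary units.
cluster = {f"worker{i}": (100, 50) for i in range(1, 5)}
print(can_remove_worker(cluster, "worker1"))
```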

 

Replacing a master is more difficult, and I don't believe we have a way to do that within Director. I could be wrong, though, so I will check on that. There is some support for replacing failed HDFS masters, but it is in the context of an HA (highly available) cluster.

 

Replacing a Cloudera Manager instance is possible, but it is complex. We currently reserve that procedure for our support personnel to execute.

 

Generally, if you find that your needs have grown to exceed your master nodes and CM instances, you should create new deployments and clusters and move the data out of the old clusters, using distcp or other appropriate tools.
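
As an illustration of the distcp route, the helper below only assembles a `hadoop distcp` command line. It does not run anything. The NameNode addresses and path are hypothetical, and `-update` (copy only files that are missing or differ on the target) is one reasonable choice among several.

```python
import shlex


def distcp_command(src_nn, dst_nn, path, update=True):
    """Build the argv for copying `path` from one cluster to another."""
    cmd = ["hadoop", "distcp"]
    if update:
        # Only copy files that are missing or different on the target.
        cmd.append("-update")
    cmd += [f"hdfs://{src_nn}{path}", f"hdfs://{dst_nn}{path}"]
    return cmd


# Hypothetical NameNode addresses; substitute your own.
argv = distcp_command("old-nn.example.com:8020", "new-nn.example.com:8020", "/data")
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run` on a host that has the Hadoop client configured for both clusters.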

Re: replacement of ec2 machine

Explorer

Thanks Bill. Now I am determined to keep backups of each service's data using the services' own tooling.

Re: replacement of ec2 machine

Explorer

Hi Bill,

 

I tried to replace a worker and was not successful at all. I followed the steps below. Please advise what went wrong.

 

  • I had 3 workers running DataNode and RegionServer roles. [successful]
  • Added one more worker via the Director client update command. [successful]
  • Ran the HDFS rebalance command after that. [successful]
  • Out of the 4 workers, selected one worker in Cloudera Manager. [successful]
  • Stopped its roles and pressed the Decommission button in Cloudera Manager. [successful]
  • Ran the update command with the worker count reduced to 3. [It never completed and ended up running for several hours.]

 

 

Regards,

Kartik

 

 

Re: replacement of ec2 machine

Explorer

@Bill Havanki

Can you please let me know the steps to perform in case the Cloudera Director instance is down or needs to be replaced?

@kartikbha: Did you try the master node replacement? If yes, can you please share the steps you performed to replace the EC2 instance of a master/CM node?

 

Regards,

Tauqeer Khan

Re: replacement of ec2 machine

Explorer

When migrating a NameNode, the co-located Failover Controller should be migrated with it. The two go together as a pair.
JournalNode migration can be done either separately or together with the NameNode.

 

Requirements:
This procedure requires cluster downtime. (Do not stop the cluster; just make sure that no one is working on it.)

Do the following before you run the wizard:
• On hosts running active and standby NameNodes, back up the data directories.
• On hosts running JournalNodes, back up the JournalNode edits directory.
• If the source host is not functioning properly or is not reliably reachable, decommission the host. (Be careful: decommissioning affects every role on the host, so choose this only if NN, FC, and JN are the only roles on the source host.)

Running the Migrate Roles Wizard
• If the host to which you want to move the NameNode is not in the cluster, follow the instructions in Adding a Host to the Cluster to add the host.

• Go to the HDFS service.
• Click the Instances tab.
• Click the Migrate Roles button.
• Click the Source Host text field and specify the host running the roles to migrate. In the Search field optionally enter hostnames to filter the list of hosts and click Search.

• Click the Destination Host text field and specify the host to which the roles will be migrated. On destination hosts, indicate whether to delete data in the NameNode data directories and JournalNode edits directory. If you choose not to delete data and such role data exists, the Migrate Roles command will not complete successfully.

• Acknowledge that the migration process incurs service unavailability by selecting the Yes, I am ready to restart the cluster now checkbox.

• Click Continue. The Command Progress screen appears, listing each step in the migration process.

• When the migration completes, click Finish.
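
The backup prerequisite above (NameNode data directories and JournalNode edits directory) could be handled with something like the sketch below, run on each relevant host before starting the wizard. The paths in the usage comment are hypothetical. Check `dfs.namenode.name.dir` and `dfs.journalnode.edits.dir` in your configuration for the real locations.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(src_dir, dest_dir):
    """Archive `src_dir` into a timestamped .tar.gz under `dest_dir`."""
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{src.name}-{stamp}.tar.gz"

    # arcname keeps paths inside the archive relative to the directory name.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname=src.name)
    return archive


# Usage (hypothetical paths):
#   backup_directory("/dfs/nn", "/backup/hdfs")
#   backup_directory("/dfs/jn", "/backup/hdfs")
```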

Re: replacement of ec2 machine

Explorer

 

@kartikbha

Thanks kartik,

 

Did you try replacing the EC2 instances of nodes hosting other master services, such as Hive, Hue, Spark, etc.?

 

I am thinking of the strategy below. Let me know your views on it.

  • Set up snapshots of the EBS volumes of the master nodes that host critical services such as Hive, Hue, Spark, etc.
  • If a master node goes down, recover it from the snapshots.
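
The snapshot step of this strategy might look like the sketch below. It assumes `ec2` is a boto3 EC2 client and that the master node's data lives on EBS volumes. The instance ID in the usage comment is a placeholder, and pagination of the volume listing is ignored for brevity.

```python
def snapshot_instance_volumes(ec2, instance_id, note="master node backup"):
    """Snapshot every EBS volume attached to `instance_id`.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the list of snapshot IDs that were started.
    """
    resp = ec2.describe_volumes(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )

    snapshot_ids = []
    for vol in resp["Volumes"]:
        snap = ec2.create_snapshot(
            VolumeId=vol["VolumeId"],
            Description=f"{note}: {instance_id} {vol['VolumeId']}",
        )
        snapshot_ids.append(snap["SnapshotId"])
    return snapshot_ids


# Usage (placeholder ID, requires boto3 and AWS credentials):
#   import boto3
#   snapshot_instance_volumes(boto3.client("ec2"), "i-0123456789abcdef0")
```

Note that a volume snapshot alone does not guarantee a consistent copy of a running service's database, so it is worth pairing this with each service's own backup procedure.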

 

Regards,

Tauqeer Khan

@Bill Havanki