Cloudera cluster on AWS spot instance

Hi,

We are thinking of running performance tests with the entire cluster on Spot instances. We were able to automate our deployment using the simple conf for EC2 instances (as shown here: https://github.com/cloudera/director-scripts/blob/master/configs/aws.simple.conf). Because of the high cost of the system, we have moved most of our other products to Spot instances, but we are unable to move the full Cloudera cluster.

For example, we were able to move Cloudera Director, but not Cloudera Manager or the DataNodes.

After going through this reference document, it seems that moving the full cluster to Spot instances is not supported:

https://github.com/cloudera/director-scripts/blob/master/configs/aws.reference.conf

We would like your input and suggestions on the following:

1. Does the deployment support running the full cluster on Spot instances?

2. The reference document states that only stateless roles can be used with Spot instances. Can somebody explain this? (Sorry if this is a lame question.)

Thanks

Anil

 

 

1 ACCEPTED SOLUTION

I was able to solve this by trying various combinations, and this is how it worked.

Can we make all the instances in a cluster Spot instances (for testing scenarios only)? The answer is yes.

The key thing to remember in the configuration file is:

1. In the instance attribute of any instance group, say master, you need the following keys:

   useSpotInstances: true
   spotBidUSDPerHr: 2.760   (this is the Spot price of the instance type you are using)

So the structure would look roughly like this:

 

  workers-spot {
      count: 10
      #
      # Minimum number of instances required to set up the cluster.
      # Fail and quit if minCount number of instances is not available in this cloud
      # environment. Else, continue setting up the cluster.
      # minCount is always set to 0 when using Spot instances.
      minCount: 0

      instance: ${instances.d24x} {
          useSpotInstances: true   # required for Spot instances
          spotBidUSDPerHr: 2.760   # required for Spot instances
          tags {
              Name: "regionserver-REPLACE-ME"
              Owner: "owner-REPLACE-ME"
          }
      }

      roles {
          HDFS: [DATANODE]
          YARN: [NODEMANAGER]
          HBASE: [REGIONSERVER]
      }

      # Optional custom role configurations
      # Configuration keys containing periods must be enclosed in double quotes.
      configs {
          HBASE {
              REGIONSERVER {
                  hbase_regionserver_handler_count: 64
              }
          }
      }
  }

  postCreateScripts: ["""#!/bin/sh
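Following the same pattern, a master group would carry the same two keys inside its instance block. The fragment below is only a sketch for a throwaway test cluster: the group name `masters`, the role assignments, and the tag values are assumptions for illustration and are not taken from my actual config, while `instances.d24x` and the bid value are reused from the worker group above.

```
masters {
    count: 1

    instance: ${instances.d24x} {
        useSpotInstances: true   # same key as in the worker group
        spotBidUSDPerHr: 2.760   # assumed bid; use the Spot price for your instance type
        tags {
            Name: "master-REPLACE-ME"
            Owner: "owner-REPLACE-ME"
        }
    }

    # Hypothetical role layout, shown only to make the group complete
    roles {
        HDFS: [NAMENODE, SECONDARYNAMENODE]
        YARN: [RESOURCEMANAGER, JOBHISTORY]
    }
}
```

Keep in mind that if a master running on a Spot instance is reclaimed, the cluster metadata goes with it, so this only makes sense for short-lived test runs.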

Hopefully this helps someone.

Thanks

Anil

 


2 REPLIES

"The reference document states that only stateless roles can be used with Spot instances. Can somebody explain this?"

 

A stateless role, as the name suggests, does not hold any state: there is no data stored on it, and it mostly works as a compute node. One reason it is recommended to use only stateless roles on Spot instances is that a Spot instance can go away at any time, and the data on it is lost with it. If the instance held data, you would suffer some data loss, which is not an ideal situation. If a stateless role is lost, however, you can create a new one in its place, and the system keeps running as long as the minimum number of servers is present.
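As a rough illustration of that split, a Director cluster config could keep the HDFS DataNode role (which stores data) on regular instances and place only the YARN NodeManager role (compute only) on a Spot group. This is a sketch and not a tested config: the group names, counts, and bid value are assumptions, and `instances.d24x` and the two Spot keys are borrowed from the accepted solution in this thread.

```
workers {
    # Regular (on-demand) instances: these hold HDFS data
    count: 5      # assumed count
    minCount: 3   # assumed minimum
    instance: ${instances.d24x}
    roles {
        HDFS: [DATANODE]
        YARN: [NODEMANAGER]
    }
}

workers-spot {
    # Spot instances: compute only, nothing is lost if they are reclaimed
    count: 10
    minCount: 0
    instance: ${instances.d24x} {
        useSpotInstances: true
        spotBidUSDPerHr: 2.760   # assumed bid
    }
    roles {
        YARN: [NODEMANAGER]
    }
}
```

With a layout like this, losing every Spot instance reduces processing capacity but leaves the HDFS data intact on the regular workers.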

 

Regarding the first question, I am not sure, since I don't work directly with Director. I hope at least one question is resolved.
