Created on 12-06-2014 11:13 PM - edited 09-16-2022 02:14 AM
I have created a multi-node cluster using CD v1.0.1 and now find that the size of the cluster is causing data timeouts during Hadoop benchmarking runs. One of our AWS experienced engineer pointed out that the lack of a placementGroup on the workers is probably what is causing us problems.
I thought when I went through the CD instalation instructions that the placement group it created would be applied to my cluster but it wasn't. I am modiying my aws.conf file to explicitly call out the AWS-PLACEMENT-GROUP CD created and attempting to create a new cluster. However the workers still come up with not Placement Group associated to with them. They are cc2.8xlarge and the AWS on-line docs says this instance should work with placement groups. I tried setting my conf file like so, per the doc spelling:
workers {
count: 48
#
# Minimum number of instances required to set up the cluster.
# Fail and quit if minCount number of instances is not available in this cloud
# environment. Else, continue setting up the cluster.
#
minCount: 24
instance: ${instances.cc2xl} {
tags {
group: worker
name: "Hadoop Worker"
}
}
roles {
HDFS: ${roles.HDFS_WORKERS}
YARN: ${roles.YARN_WORKERS}
HBASE: ${roles.HBASE_WORKERS}
}
placementGroup: "AWS-PLACEMENT-GROUP-us-west-2-11-13-2014"
}
Created 12-09-2014 03:15 PM
So turns out this works. As usual it was an operator error. What I should have done was put the placementGroup keyword inside the instance container. Once I moved the setting into the worker instance container, rerunning the bootstrap created 4 workers with the correct placementGroup setting.
Here is what I did (compare it to the original pasted info):
instance: ${instances.cc2xl} {
tags {
group: worker
Name: "Hadoop Worker-XX"
}
placementGroup: "AWS-PLACEMENT-GROUP-us-west-2-11-13-2014"
}
In hindsight, makes perfect sense. I didn't think this out logically when first creating the configuration
Created 12-08-2014 06:01 PM
Thanks for reporting! We are still testing this out and we will reply with more details soon.
When using Placement Groups you should know about the following limitations:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html
Created 12-08-2014 06:49 PM
Thanks for the feedback. Yes, I visited that link when I started researching the use of the placmentGroup tag. My instance is one of the compute type instances the link says is supported, as I mentioned in my rambling opening post.
Created 12-09-2014 03:15 PM
So turns out this works. As usual it was an operator error. What I should have done was put the placementGroup keyword inside the instance container. Once I moved the setting into the worker instance container, rerunning the bootstrap created 4 workers with the correct placementGroup setting.
Here is what I did (compare it to the original pasted info):
instance: ${instances.cc2xl} {
tags {
group: worker
Name: "Hadoop Worker-XX"
}
placementGroup: "AWS-PLACEMENT-GROUP-us-west-2-11-13-2014"
}
In hindsight, makes perfect sense. I didn't think this out logically when first creating the configuration
Created 12-09-2014 03:27 PM
Great! I've also missed that in your initial post. This sounds like something we can warn about during configuration file parsing. I will file an internal feature request to improve it.
Do placement groups improve things from a network latency perspective for you as expected?