New Contributor
Posts: 1
Registered: ‎04-24-2014

Launching an EC2 cluster with Whirr



I'm trying to launch an EC2 cluster using Apache Whirr. My configuration is as follows:


whirr.instance-templates=1 hadoop-namenode+yarn-resourcemanager+mapreduce-historyserver,2 hadoop-datanode+yarn-nodemanager
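For reference, the rest of my hadoop.properties is roughly as follows (the cluster name and the credential variables are placeholders, not my real values):

```properties
# Sketch of the surrounding hadoop.properties -- names and values are placeholders
whirr.cluster-name=myhadoopcluster
whirr.instance-templates=1 hadoop-namenode+yarn-resourcemanager+mapreduce-historyserver,2 hadoop-datanode+yarn-nodemanager
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
```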


I'm following the docs in this link. My cluster seems to be up after a few minutes: I can SSH into both the master and the two slave nodes, and I even have access to the ResourceManager web UI and the NameNode. Services such as the DataNodes are also up. Unfortunately, when I try to run a YARN job from the command line I get the following error:


-bash-3.2$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount input output
14/04/24 14:14:49 WARN conf.Configuration: is deprecated. Instead, use dfs.metrics.session-id
14/04/24 14:14:49 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
14/04/24 14:14:49 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/user/ivan814841750/.staging/job_local814841750_0001
14/04/24 14:14:49 ERROR security.UserGroupInformation: PriviledgedActionException as:ivan (auth:SIMPLE) cause:ENOENT: No such file or directory
ENOENT: No such file or directory
at Method)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(
at org.apache.hadoop.fs.FilterFileSystem.setPermission(
at org.apache.hadoop.fs.FileSystem.mkdirs(
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(
at org.apache.hadoop.mapreduce.Job$
at org.apache.hadoop.mapreduce.Job$
at Method)
at org.apache.hadoop.mapreduce.Job.submit(
at org.apache.hadoop.mapreduce.Job.waitForCompletion(
at org.apache.hadoop.examples.WordCount.main(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(
at org.apache.hadoop.util.ProgramDriver.driver(
at org.apache.hadoop.examples.ExampleDriver.main(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.hadoop.util.RunJar.main(


I also have some other doubts about Whirr (I'm very new to this technology) and launching a cluster with it:

1) After I launch my cluster, Whirr gives me the SSH command I have to run to access my nodes. Although I can reach them that way, I would like to access the Amazon machines directly using those keys. I have noticed that in my AWS console the cluster's instances have a different key pair (something like jclouds#myhadoopcluster#a7f). Since it's not possible to download those keys, I was wondering how to initialize the cluster with a valid key pair so I can access the AWS machines afterwards.
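From the Whirr docs, my understanding is that the key pair used for the nodes can be controlled with the key-file properties, e.g. (the paths below are just examples):

```properties
# Point Whirr at an existing RSA key pair -- paths are examples
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
```

I'm not sure whether this also changes the key pair name shown in the AWS console, or only the key that gets authorized on the instances.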


2) Another issue I have found is that, even though I have tried several Amazon images to run the cluster, all of them have failed except the one pointed to in the Cloudera docs. Is there any list where valid AMIs are defined?
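For reference, this is how I'm selecting the image to try (the AMI id below is an illustrative placeholder, not a recommendation):

```properties
# Select a specific region, AMI and instance size -- values are placeholders
whirr.location-id=us-east-1
whirr.image-id=us-east-1/ami-xxxxxxxx
whirr.hardware-id=m1.large
```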





Posts: 1,903
Kudos: 435
Solutions: 307
Registered: ‎07-31-2013

Re: Launching an EC2 cluster with Whirr

I am unsure about your "other doubts", so I'll defer to an AMZN expert on those.

For your posted error, though, it appears the node you're submitting the job from does not have proper client configurations under /etc/hadoop/conf/. The job is trying to run locally (note the file:/ staging path and the job_local ID in your output), which should not happen. Check your local configuration files: in particular, mapred-site.xml should carry the "mapreduce.framework.name" property with its value set to "yarn", alongside fully configured yarn-site.xml, hdfs-site.xml and core-site.xml files.
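For reference, a minimal mapred-site.xml along those lines would look like the below (your other client XMLs also need the real NameNode and ResourceManager addresses for the cluster):

```xml
<?xml version="1.0"?>
<!-- /etc/hadoop/conf/mapred-site.xml: tell job clients to submit to YARN
     instead of falling back to the local job runner -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```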