I have two problems after installing CDH 5.4.7 on Amazon EC2 instances. The first is a Java initialization error when running the hive command; the second is a Hadoop version mismatch.
What I did:
I spun up 5 instances in EC2 (DataNode configuration: four t2.medium, Ubuntu 12.04, 4 GB RAM, 10 GB SSD; NameNode configuration: one m3.large, Ubuntu 12.04, 8 GB RAM, 32 GB SSD) and downloaded cloudera.bin using
I ran this binary, and the cluster setup completed without problems.
I then SSHed into one of my EC2 instances, typed hive, and got the error below:
Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:58)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:631)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 0.20.2-cdh3u5
    at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:169)
    at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:134)
    at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:95)
    at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:356)
    ... 8 more
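The "Caused by" line suggests that Hive inspects the version string reported by the Hadoop libraries on its classpath and rejects 0.20.2-cdh3u5. A minimal Python sketch of that kind of check follows. It is an illustration only, not Hive's actual code: the function names are hypothetical, and the set of accepted major versions (1 and 2) is an assumption.

```python
# Illustrative sketch only; names and the accepted set are assumptions,
# not taken from the Hive source.
SUPPORTED_MAJORS = {1, 2}  # assumption: only Hadoop 1.x and 2.x are accepted


def hadoop_major_version(version: str) -> int:
    """Return the leading numeric component of a Hadoop version string."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"Illegal Hadoop version: {version} (expected A.B.* format)")
    return int(parts[0])


def check_hadoop_version(version: str) -> int:
    """Return the major version, or raise if it is not in SUPPORTED_MAJORS."""
    major = hadoop_major_version(version)
    if major not in SUPPORTED_MAJORS:
        raise ValueError(f"Unrecognized Hadoop major version number: {version}")
    return major
```

Under these assumptions, a CDH 5 style string such as 2.6.0-cdh5.4.7 passes, while 0.20.2-cdh3u5 has major version 0 and is rejected. That would make the Hive error a symptom of the version mismatch rather than a separate problem.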
Problem 2: I ran hadoop version to check the installed version and was shocked: in the GUI I picked CDH 5.4.7, but the installed version appears to be CDH3. Is this a problem with the Cloudera installation binary itself? Kindly help in resolving both.
~# hadoop version
Hadoop 0.20.2-cdh3u5
Subversion file:///data/1/tmp/nightly_2012-10-05_17-10-50_3/hadoop-0.20-0.20.2+923.421-1~maverick -r 30233064aaf5f2492bc687d61d72956876102109
Compiled by root on Fri Oct 5 18:46:31 PDT 2012
From source with checksum de1770d69aa93107a133657faa8ef467
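One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that an older hadoop launcher left on the instance comes first on the PATH. A small Python sketch for listing every hadoop executable on the PATH in lookup order (the helper name find_executables is hypothetical):

```python
import os


def find_executables(name, path=None):
    """Return (candidate, resolved_target) for each executable `name` on the path."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            # realpath follows symlinks, e.g. ones created by alternatives
            hits.append((candidate, os.path.realpath(candidate)))
    return hits


if __name__ == "__main__":
    for candidate, target in find_executables("hadoop"):
        print(f"{candidate} -> {target}")
```

The first entry printed is the one the shell runs when you type hadoop. If it resolves to a different installation than the one Cloudera Manager deployed, that may account for the CDH3 version string.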
Your post has been moved to the Cloudera Manager forum. Cloudera Manager, as you've observed, will consume EC2 instances and set up CDH clusters on them. In case you're not already aware of it, Cloudera Director will provision EC2 instances for you and then use Cloudera Manager to deploy CDH clusters on those instances. Cloudera Director provides lifecycle management of cloud instances, including operations such as growing and shrinking clusters. You can download Cloudera Director for free and learn more about it here: http://www.cloudera.com/content/cloudera/en/products-and-services/director.html.
Thanks for moving this to the right section. By the way, I tried downloading Cloudera Director earlier, and it throws errors from both the deb and yum repos. Also, the page you linked doesn't show any download links. I even tried adding Cloudera Director to the deb lists in Ubuntu, still without success.
I tried installing the Cloudera Manager binary on GCE; it shows the current version, so my problem is solved there. On AWS, however, the Hadoop version mismatch and the CDH version issue still exist.
So if someone could try installing the latest Cloudera Manager on EC2 and confirm whether it's a bug, that would be helpful.