Member since: 01-01-2014 | Posts: 4 | Kudos Received: 0 | Solutions: 0
01-08-2014
11:32 AM
This is helpful. To be clear, I have no problem installing client software on the Mac; I just do not want to use it as a cluster node. If I understand Cloudera terminology correctly, I want it to act only as a Gateway. I've looked into WebHDFS (thanks for pointing me to it!) and it looks great. The only issue so far is that it seems to be a good tool for doing "things" in the cluster (creating directories, files, etc.), but I haven't seen any example of using WebHDFS to launch a .jar file with code... I've also started to research mrjob, which looks quite promising. I wonder if anybody has used mrjob from a Gateway-type (client-type) node with Cloudera...
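For reference, here is a minimal sketch of the kind of WebHDFS call I mean, using only the Python standard library. The host name, port, and user below are placeholders I am assuming for illustration (50070 was the NameNode web port on older Hadoop releases; check your own cluster), and the helper names are made up. The sketch only builds the request, it does not send it. It also only covers filesystem operations, which matches what I have seen so far: nothing here submits a job.

```python
import urllib.request
from urllib.parse import quote, urlencode

# Assumptions for illustration only (not confirmed for any particular setup):
NAMENODE_HOST = "quickstart.cloudera"  # placeholder host name
NAMENODE_HTTP_PORT = 50070             # NameNode web port on older Hadoop releases


def webhdfs_url(path, op, params=None, host=NAMENODE_HOST, port=NAMENODE_HTTP_PORT):
    """Build a WebHDFS REST URL for an absolute HDFS path (hypothetical helper)."""
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    query = {"op": op}
    query.update(params or {})
    return "http://{}:{}/webhdfs/v1{}?{}".format(host, port, quote(path), urlencode(query))


def mkdirs_request(path, user):
    """Prepare (but do not send) the HTTP PUT that asks WebHDFS to create a directory."""
    url = webhdfs_url(path, "MKDIRS", {"user.name": user})
    return urllib.request.Request(url, method="PUT")


# Building the request needs no network access:
req = mkdirs_request("/user/cloudera/demo", "cloudera")
print(req.get_method(), req.full_url)

# To actually send it against a reachable NameNode, one would call:
#   urllib.request.urlopen(req).read()
```

Listing a directory would work the same way with `op=LISTSTATUS` and an HTTP GET instead of PUT.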
01-03-2014
02:23 AM
Why would it be necessary to install *all* of Hadoop on the client? My understanding is that installing these client files is all that I need on the client side (e.g. my Apple Mac). For example, Cloudera Manager provides client files; I assume they are meant for exactly this use case? http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.5.3/Cloudera-Manager-Enterprise-Edition-User-Guide/cmeeug_topic_5_9.html
01-02-2014
09:57 AM
I do not intend to install Hadoop on OS X. I would just like to install the client libraries that, as I understand it, are needed by certain packages. My idea is to write, test, and debug the code on the Mac and then execute it on the VM, ideally launching it from OS X. As for Python, I am referring to libraries like mrjob, pydoop, and similar.
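To illustrate the "write, test and debug on the Mac" part, here is a rough sketch of a word count written in plain Python, with the map and reduce steps as ordinary functions so they can be exercised locally without any Hadoop installation. The function names and the word-count task are my own illustrative choices, not taken from mrjob or pydoop; `run_local` simply imitates the sort-by-key step that Hadoop performs between map and reduce.

```python
import itertools


def mapper(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reducer(pairs):
    """Sum counts per word; expects pairs already sorted by word."""
    for word, group in itertools.groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Imitate map -> sort -> reduce in a single local process."""
    return list(reducer(sorted(mapper(lines))))


print(run_local(["Hello hadoop", "hello Mac"]))
```

For a real Hadoop Streaming job, the same logic would be split into two scripts that read lines from standard input and write tab-separated key/value lines to standard output; only that I/O wrapper would differ from what is tested locally.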
01-01-2014
10:29 AM
I would like to write MapReduce code, ideally in Python, on my Apple Mac and stream it to the QuickStart VM. Ideally, my development setup would use the Python environment on my Mac and the QuickStart VM (later to be expanded to a cluster). While there are many descriptions of how to connect or stream code from within a node of the Hadoop cluster or sandbox (e.g. from the NameNode), I am unclear on what to do to connect just as a client. For example, I assume I need to install some Hadoop client libraries on OS X to talk to the sandbox HDFS. Where do I get these libraries from? How do I install them? Which Python package works best? What IP address should I use to stream my Python code? Any help, and any link to a tutorial covering this, would be great!