Hello all, questions about the Oryx project (https://github.com/cloudera/oryx) are also welcome here. At the moment this serves as the official mailing list and forum for the project.
Also feel free to ask on Quora (http://www.quora.com/Oryx-Machine-Learning-project) or StackOverflow (http://stackoverflow.com/questions/tagged/oryx). We're watching.
There is no separate Quick Start VM for Oryx, but there is really no need for one; the standard Quick Start VM is fine. Oryx is just two binary .jar files (https://github.com/cloudera/oryx/releases), so download them anywhere onto the VM and run them per the instructions. Nothing else would have needed setup in a canned VM anyway.
Yes, any CDH 4.x version should be fine. If you are using MR1, you need the separate binary compiled for MR1. If you use CDH5, you need the Hadoop 2.2+ binary. These are all available as downloads.
I set up Oryx on the CDH Quick Start VM and ran a sample collaborative filtering example with the ALS algorithm in local mode, which worked well.
But when I try actual data (about 300 MB) in local mode, I do not see any progress, such as the X and Y folders being created. Only stats.json, computation.conf, and _SUCCESS (0 KB) are created, nothing else. How can I confirm that the computation is running? Is there any other location where logs are generated?
It would be ideal to set up a new thread in this forum for a new question, by the way. If you could please do that, and record some more basic info about how you're running this -- local or distributed mode? -- I could try to help.
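When posting the new thread, it may help to include a listing of what the run actually produced. The following is only a rough sketch for gathering that, not an Oryx tool: the function name summarize_generation is hypothetical, the artifact names are simply the ones mentioned in the question above, and the directory you pass in is assumed to be wherever your local-mode run writes its output.

```python
from pathlib import Path

# Artifact names taken from the question above; the exact on-disk layout
# of an Oryx run is an assumption here, so adjust this list as needed.
EXPECTED = ("X", "Y", "stats.json", "computation.conf", "_SUCCESS")


def summarize_generation(gen_dir):
    """Return {name: total size in bytes, or None if missing}."""
    root = Path(gen_dir)
    report = {}
    for name in EXPECTED:
        path = root / name
        if not path.exists():
            report[name] = None
        elif path.is_dir():
            # Sum the sizes of all regular files under the folder.
            report[name] = sum(
                f.stat().st_size for f in path.rglob("*") if f.is_file()
            )
        else:
            report[name] = path.stat().st_size
    return report
```

Calling summarize_generation on the output directory and pasting the resulting dictionary into the post shows at a glance which pieces are missing (None) and which exist but are empty (0).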