Member since: 07-29-2013
Posts: 366
Kudos Received: 69
Solutions: 71
My Accepted Solutions
Views | Posted |
---|---|
5086 | 03-09-2016 01:21 AM |
4300 | 03-07-2016 01:52 AM |
13538 | 02-29-2016 04:40 AM |
4018 | 02-22-2016 03:08 PM |
5017 | 01-19-2016 02:13 PM |
03-05-2014
08:06 AM
Ha, you mean is it memoryless? Yes. You can just ship the whole generation directory around. To roll back generations, just delete the newer ones and restart the serving layer, and so on.
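As a rough illustration of the rollback step, here is a minimal Python sketch. It assumes the generations sit on a local filesystem as subdirectories with integer names (for example `00000`, `00001`); that layout, the function name `roll_back`, and its parameters are assumptions for illustration, not something stated in the post. On HDFS the same idea would go through the HDFS tooling instead.

```python
import shutil
from pathlib import Path


def roll_back(instance_dir, keep_generation):
    """Delete every generation directory newer than keep_generation.

    Assumption: each generation is a subdirectory whose name is an
    integer ID (possibly zero-padded). Anything else is left alone.
    Returns the sorted names of the directories that were removed.
    """
    removed = []
    for child in Path(instance_dir).iterdir():
        if child.is_dir() and child.name.isdigit() and int(child.name) > keep_generation:
            shutil.rmtree(child)  # remove the newer generation entirely
            removed.append(child.name)
    return sorted(removed)
```

After the newer directories are gone, the serving layer still has to be restarted so that it loads the generation that remains.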
03-05-2014
02:48 AM
The project uses Maven for builds. Eclipse supports Maven builds directly, so the project works in Eclipse as well. (If there are any problems, they are in the Eclipse integration with Maven.) I use IntelliJ and recommend it, but there is nothing additional to say here about working with Eclipse. It just works, because the Maven build works, and Eclipse uses Maven builds. Have you modified the build? You say, for example, that you had to add the avro plugin to pluginManagement, but it is certainly already there in the parent POM: https://github.com/cloudera/oryx/blob/master/pom.xml#L671 You say there is a compile error, but the dependencies you seem to be missing are already correctly specified in the POM. That is, you can verify that the build is correct with "mvn compile". You need to clarify what you are doing, because I don't think you are building the project as-is.
03-05-2014
12:11 AM
It would be helpful to know what errors you are getting. It should build fine with Maven, as you show.
02-18-2014
09:53 AM
Hard filtering rules need to be implemented in a RescorerProvider, or in logic on the caller side. Tagging users and items with a locale could make sense. It would function as a soft filter, nudging people towards things in the same locale. That could be useful as well, but it is a different thing from implementing business rules. If your items and users are nearly completely disjoint by locale (e.g. very few items are available in multiple locales and very few users shop in multiple locales), then separate models might be the best way to go. No filtering logic is needed, although you then have to manage a model per locale. But the models are smaller and easier to handle. If there is moderate overlap, then a unified model can benefit from the cross-locale learning.
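For the caller-side option, a hard locale filter can be as simple as dropping any recommended item that is not sold in the user's locale. The sketch below is illustrative only: `filter_by_locale`, the `(item_id, score)` result shape, and the `item_locales` lookup are all hypothetical names, and it does not use the RescorerProvider interface at all.

```python
def filter_by_locale(recommendations, item_locales, user_locale):
    """Keep only the recommendations available in the user's locale.

    recommendations: list of (item_id, score) pairs, best first.
    item_locales: dict mapping item_id to the set of locales it is sold in.
    Items with no locale information are dropped (a hard rule).
    """
    return [
        (item_id, score)
        for item_id, score in recommendations
        if user_locale in item_locales.get(item_id, set())
    ]


# Hypothetical data, just to show the call shape.
recs = [("A", 0.9), ("B", 0.7), ("C", 0.4)]
locales = {"A": {"en_GB"}, "B": {"en_US", "en_GB"}, "C": {"en_US"}}
us_only = filter_by_locale(recs, locales, "en_US")  # keeps B and C, in order
```

Because filtering after the fact shortens the list, a caller using this approach would typically request more results than it finally needs.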
02-13-2014
01:17 AM
1 Kudo
Yes, that's the same me. Please have a look at the project page for some simple examples to get started: https://github.com/cloudera/oryx There is no Mongo integration per se, but you could always hack on it.
02-10-2014
08:50 AM
That's good, although I am still not sure why it worked fine for me with quite different params. The transformation should not have done much. It could be that the singularity tolerance is too strict, but I doubt it. There's going to be a fairly big rewrite of the computation, for example to use Spark in some parts. As part of that I am going to build evaluation into the pipeline itself, so that it's always tuning as it goes. It's not going to come out soon (it is just in the design phase), but the idea is that this should not be something anyone has to do by hand. For practical purposes, I would just proceed with these params for now and return to the idea of optimization later. I am guessing (?) your real data set is different anyway and would require different params. Or for this data set you could use the local build.
02-10-2014
03:29 AM
It's "normal" for this result to happen if the parameters are way out of kilter for the data set. I suppose it tends to be easier for that to happen with small data. So whether it's reproducing a problem depends on the data. But if you think the params are quite reasonable for the data and you see this, yes please send it to me.
02-09-2014
06:28 AM
1 Kudo
This is good. There is no performance difference between computing 10 and 100 recommendations since it still considers all non-filtered items each time. (OK I suppose it takes a tiny bit longer to send 100 results over the network than 10.) The results are not precomputed but computed on the fly each time.
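To see why the requested count barely matters, here is a minimal sketch of on-the-fly top-N scoring for a factored model. It is an illustration under assumptions, not the project's actual code: scoring is taken to be a dot product of user and item feature vectors, and `recommend`, `item_vectors`, and `is_filtered` are hypothetical names.

```python
import heapq


def recommend(user_vector, item_vectors, how_many, is_filtered=lambda item_id: False):
    """Score every non-filtered item, then keep the how_many best.

    The scoring loop touches every candidate item no matter how small
    how_many is, so that part of the work is the same for 10 or 100 results.
    """
    scored = (
        (sum(u * v for u, v in zip(user_vector, vec)), item_id)
        for item_id, vec in item_vectors.items()
        if not is_filtered(item_id)
    )
    best = heapq.nlargest(how_many, scored)  # sorted by score, highest first
    return [(item_id, score) for score, item_id in best]


# Hypothetical toy model with two features.
items = {"a": [1, 0], "b": [0, 1], "c": [1, 1]}
top2 = recommend([1, 2], items, 2)
```

Only the final selection step depends on `how_many`, and it is cheap next to scoring all items.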
02-08-2014
03:08 PM
Oops, fixed. Yes, I'm using CDH5b1 too, so that's not a difference. Can you compile from HEAD to make sure we're synced up there? You may already be, just checking. I can make a binary too. Any logs would be of interest for sure. I would suggest trying again with clearly small values for features (like 10) and for lambda (like 0.0001) to see if that at least works. I would expect a lower number of features to be appropriate, given that there is a smallish number of items. You might try the optimizer again with lower ranges for both. More features encourage overfitting and more lambda encourages underfitting, so they counteract each other to some extent. It's possible you will find a better value when both are low.
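Searching a low range for both parameters amounts to a small grid search. The sketch below is a generic version, not the project's optimizer: `param_grid` and `best_params` are hypothetical helpers, and `evaluate` stands for whatever function builds a model with the given settings and returns a score where higher is better.

```python
from itertools import product


def param_grid(features_values, lambda_values):
    """All (features, lambda) combinations to try."""
    return list(product(features_values, lambda_values))


def best_params(candidates, evaluate):
    """Return the candidate pair with the highest evaluation score.

    evaluate(features, lam) is supplied by the caller; it is assumed to
    train and score a model, returning a number where higher is better.
    """
    return max(candidates, key=lambda params: evaluate(*params))


# Low ranges for both parameters, in the spirit of the suggestion above.
candidates = param_grid([5, 10, 20], [0.0001, 0.001, 0.01])
```

With nine candidates the search stays cheap, and the ranges can be widened afterwards if the best pair sits on the edge of the grid.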
02-08-2014
01:18 PM
Yes, that explains why you didn't see the same initial problem. Well, good that it was fixed anyhow. Text vs numeric shouldn't matter at all. Underneath they are both hashed. It looks like the amount of data and its nature are the same if it's just that the IDs were hashed. I can't imagine collisions are an issue. I tried converting these 1-1 to an alphanumeric ID, and it worked for me. You are using CDH 4.x vs 5, right? That could be a difference, but I still wouldn't expect a problem of this form. Anything else of interest in the logs? You're welcome to send me all of it. Are you starting from scratch when you run the test?
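A quick back-of-the-envelope check shows why collisions are unlikely to matter. The sketch assumes IDs are hashed to 64-bit values; the post does not say which hash function or width is used, so MD5 truncated to 64 bits is purely an assumption here, as are the names `hash_id` and `collision_probability`. The estimate is the standard birthday approximation, 1 - exp(-n(n-1) / 2^(bits+1)).

```python
import hashlib
import math


def hash_id(raw_id):
    """Map any ID, text or numeric, to a signed 64-bit integer.

    Assumption: MD5 truncated to 8 bytes. The ID is converted to a
    string first, so 123 and "123" hash to the same value.
    """
    digest = hashlib.md5(str(raw_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def collision_probability(num_ids, bits=64):
    """Birthday-bound estimate of at least one collision among num_ids hashes."""
    exponent = num_ids * (num_ids - 1) / (2.0 * 2 ** bits)
    return -math.expm1(-exponent)  # equals 1 - exp(-exponent), stable for tiny values
```

For a million distinct IDs the estimate is on the order of 1e-8, which is consistent with collisions not being the cause of a visible problem.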