Member since: 07-29-2013
Posts: 366
Kudos Received: 69
Solutions: 71
My Accepted Solutions
Views | Posted |
---|---|
5137 | 03-09-2016 01:21 AM |
4330 | 03-07-2016 01:52 AM |
13643 | 02-29-2016 04:40 AM |
4056 | 02-22-2016 03:08 PM |
5065 | 01-19-2016 02:13 PM |
12-07-2014
02:10 PM
No, I see it finish at 6 iterations with MAP about 0.15 on Hadoop. Is this the same data set? Double-check that you have the latest build, and maybe start from scratch with no other intermediate results.
12-07-2014
08:31 AM
OK, I'm pretty certain I found the bug and fixed it here: https://github.com/cloudera/oryx/commit/437df94d0b1c9d27b5c9f3b984b98973237d6f99
The factorization works as expected now. Are you able to test it too?
12-07-2014
04:09 AM
I think you can perhaps see in the servlets under als-serving/ how they access the data structure for a generation, which includes a map from IDs to float[] (latent feature vectors) in memory. You could just clone how one of the servlets works and is initialized, and change it to return feature vectors.
12-05-2014
04:38 PM
I'm certain it has nothing to do with the input itself. It looks fine, and that type of problem would look different.
12-05-2014
01:56 PM
Strange, I do indeed get much different answers on the Hadoop version and they don't look quite right. The first row and column are very small and there's no good reason for that. I'll keep digging in to see where things go wrong. The fact that MAP is good suggests that the model is good during iteration but something goes wrong at the end.
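For anyone following along who hasn't met MAP (mean average precision) before, here is a minimal sketch of one common definition. This is not necessarily how Oryx computes it; the function names and the choice to normalize by the number of relevant items are assumptions made for illustration.

```python
def average_precision(recommended, relevant):
    """Average precision for one user.

    recommended: ranked list of item IDs (best first).
    relevant: held-out items the user actually interacted with.
    Normalizing by len(relevant) is one common convention (an assumption here).
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(recommended, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k  # precision at rank k
    return precision_sum / len(relevant)


def mean_average_precision(recommended_by_user, relevant_by_user):
    """Mean of the per-user average precision over all users with recommendations."""
    users = list(recommended_by_user)
    if not users:
        return 0.0
    total = sum(
        average_precision(recommended_by_user[u], relevant_by_user.get(u, []))
        for u in users
    )
    return total / len(users)
```

A MAP that improves across iterations only says the ranking produced during iteration is reasonable; it says nothing about whether the final matrices were written out correctly.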
12-05-2014
12:40 PM
You are right that it's unlikely that the earlier computations would work if the data were low rank. OK, synthetic data is ruled out. It's not quite X or Y that is singular or nonsingular, it's X'*X and Y'*Y. Small absolute values in the matrices are normal.
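To make the X'*X point concrete, here is a rough sketch of how one could check whether the Gram matrix of a factor matrix is (numerically) singular. It is plain Python for illustration only, not Oryx code; the helper names and the relative tolerance are assumptions. The tolerance is relative to the largest entry, so small absolute values alone do not trigger a false alarm.

```python
def gram(matrix):
    """Return M' * M for a row-major list-of-lists matrix M (n rows, k columns)."""
    k = len(matrix[0])
    return [
        [sum(row[i] * row[j] for row in matrix) for j in range(k)]
        for i in range(k)
    ]


def numerical_rank(square, rel_tol=1e-9):
    """Rank of a square matrix via Gaussian elimination with partial pivoting.

    Pivots no larger than rel_tol * (largest absolute entry) count as zero.
    rel_tol is an arbitrary illustrative choice.
    """
    a = [row[:] for row in square]  # work on a copy
    n = len(a)
    threshold = rel_tol * max(abs(v) for row in a for v in row)
    rank = 0
    for col in range(n):
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) <= threshold:
            continue  # no usable pivot in this column
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, n):
            factor = a[r][col] / a[rank][col]
            for c in range(col, n):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank


# Hypothetical 3x2 factor matrix whose second column is twice the first.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
is_singular = numerical_rank(gram(X)) < len(X[0])
```

If the rank of X'*X is less than the number of features, the linear system solved in each ALS step has no unique solution, which is the failure mode being discussed.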
12-05-2014
12:27 PM
Hm, I wonder if the jobserver just needs to be updated in the VM. You could try building and running your own. I've not used the jobserver myself. Someone else may have more insight or it may be a good question for the VM forum.
12-05-2014
08:19 AM
Is this data synthetically generated? I'm also wondering if somehow it really does have rank less than about 6. That's possible for large data sets if they were algorithmically generated.
12-05-2014
08:04 AM
Got it, spark-submit works, but you want to use the jobserver. This could be my ignorance, but I thought you still had to build and install the jobserver yourself. Hue doesn't seem to have it in CDH 5.2, but I haven't looked at the VM in a while. Are you building the jobserver yourself or not?
12-05-2014
07:52 AM
Yes, that's fine then. You do not even need to build against the CDH artifacts. You do need to use spark-submit. Could it be an issue with the jobserver? How are you deploying that, and is it consistent with your CDH installation?