I want to confirm one thing.
In Oryx 1's ALS, the iterations alternate between computing latent features for users and for items.
Say that during the computation of an item's latent features, the user latent-feature matrix X (as computed so far) is held fixed, and the item's
latent features are updated by solving a least-squares regression...
In the Hadoop case, say I have 5 YARN nodes. In the implementation, is the same X matrix passed to each Hadoop node when it runs ALS for items
(that is, there are 5 copies of X, one inside each Hadoop node)?
Yes, that's how it works in 1.x. Really, a subset of X is passed to workers depending on which rows they will need, so it's not the whole matrix. This subset is loaded into memory, so it's doing a fast but memory-hungry in-memory join.
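To make the alternation concrete, here is a minimal numpy sketch of one item's update with X held fixed. It is illustrative only (update_item, user_idx, ratings, and lam are made-up names, not Oryx's actual API), but it shows why a worker only needs the rows of X for the users who interacted with the items it is solving for.

import numpy as np

def update_item(X, user_idx, ratings, lam):
    # Only the rows of X for users who touched this item enter the solve,
    # which is why a worker can get by with a subset of X in memory.
    X_sub = X[user_idx]                            # (n_obs, k)
    A = X_sub.T @ X_sub + lam * np.eye(X.shape[1])
    b = X_sub.T @ ratings
    return np.linalg.solve(A, b)                   # new k-dim item vector

k = 10
X = np.random.rand(1000, k)                        # fixed user-feature matrix
y_i = update_item(X, user_idx=np.array([3, 42, 7]),
                  ratings=np.array([5.0, 3.0, 4.0]), lam=0.1)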
That's right, but this term, XtX, can be pre-computed fairly efficiently and sent to the workers. XtX is quite small relative to X since X is tall and skinny: X has one row per user but only k columns, so XtX is just k x k.
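As a hedged sketch of that idea, assuming the Hu et al. implicit-feedback formulation (the names below are illustrative, not the actual Oryx code): the k-by-k Gram matrix XtX is computed once and shared, and each item's solve only adds a small correction built from its own observed users.

import numpy as np

k = 10
X = np.random.rand(1000, k)
lam = 0.1
XtX = X.T @ X                                      # k x k, tiny next to the tall, skinny X

def update_item_implicit(XtX, X_sub, conf, lam, k):
    # X^T C_i X = XtX + X_sub^T (C_i - I) X_sub, so only observed rows contribute.
    A = XtX + X_sub.T @ np.diag(conf - 1.0) @ X_sub + lam * np.eye(k)
    # With preferences p = 1 for observed entries, X^T C_i p reduces to X_sub^T c.
    b = X_sub.T @ conf
    return np.linalg.solve(A, b)

y_i = update_item_implicit(XtX, X[[3, 42, 7]],
                           conf=np.array([4.0, 2.0, 9.0]), lam=lam, k=k)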
A related question.
I notice the regularization parameter lambda is weighted by r_u.
This is different from Hu's original implicit-feedback paper.
Any idea why we want to weight lambda?
Yes, this is the 'weighted regularization' idea grafted on from another paper, http://link.springer.com/chapter/10.1007%2F978-3-540-68880-8_32
MLlib does it too and explains one pretty good reason for it: http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html#scaling-of-the-regularization...
You could also say it exists so that the model doesn't heavily favor fitting the preferences of prolific users.
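For illustration, here is a minimal numpy sketch of that weighted-lambda idea, assuming r_u denotes the number of observations for user u as in the ALS-WR paper linked above; update_user_weighted, Y_sub, and lam are made-up names, not the actual Oryx or MLlib code.

import numpy as np

def update_user_weighted(Y_sub, ratings, lam):
    # Scale the ridge term by n_u, this user's number of observations,
    # so heavy users aren't effectively under-regularized.
    n_u = len(ratings)
    k = Y_sub.shape[1]
    A = Y_sub.T @ Y_sub + lam * n_u * np.eye(k)
    b = Y_sub.T @ ratings
    return np.linalg.solve(A, b)

k = 10
Y_sub = np.random.rand(4, k)                       # item vectors for the 4 items this user rated
x_u = update_user_weighted(Y_sub, np.array([5.0, 3.0, 4.0, 2.0]), lam=0.1)

The practical effect is that the regularization stays roughly comparable across users and items with very different activity levels, rather than lambda's influence shrinking as a user's observation count grows.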