Retrieve and modify latent feature vectors on the fly?

Explorer

Sean,
 
We are investigating ways to retrieve and modify the latent feature vectors on the fly.
One practical use case is resolving the "cold start" problem.
For example, given a new item without any user-item associations, we want to "approximate" the item's latent vector.
One idea is to compare the new item to the other items by similarity (attribute-based similarity, not latent-feature similarity), then use the latent vectors of its k nearest neighbors to approximate the new item's latent vector. Basically, it's a k-NN based approach.
There could be other approaches. Anyway, let's say the new item's latent vector has been estimated somehow. We then want to "insert" this new entry into the existing item latent vectors (the Y matrix). I understand there is no API endpoint for that. However, is it feasible to work around this at the Java jar level (computation/serving jar), say by calling getY, modifying the Y matrix, and "saving" it back? Any suggestions are welcome.
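
The k-NN idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Neighbor class, the approximateVector method, and the choice of a similarity-weighted average are all hypothetical and are not part of Oryx. The attribute-based similarities are assumed to be computed elsewhere.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KnnColdStart {

  /** Hypothetical holder: an existing item ID and its attribute similarity to the new item. */
  public static final class Neighbor {
    final String itemID;
    final double similarity;

    public Neighbor(String itemID, double similarity) {
      this.itemID = itemID;
      this.similarity = similarity;
    }
  }

  /**
   * Approximates a new item's latent vector as the similarity-weighted average
   * of the latent vectors of its k most similar existing items.
   * Returns null if no usable neighbor is found.
   */
  public static float[] approximateVector(List<Neighbor> candidates,
                                          Map<String, float[]> itemVectors,
                                          int k) {
    List<Neighbor> sorted = new ArrayList<>(candidates);
    // Most similar first.
    sorted.sort((a, b) -> Double.compare(b.similarity, a.similarity));

    double[] sum = null;
    double totalWeight = 0.0;
    int used = 0;
    for (Neighbor n : sorted) {
      if (used == k) {
        break;
      }
      float[] v = itemVectors.get(n.itemID);
      // Skip items with no latent vector or a non-positive similarity.
      if (v == null || n.similarity <= 0.0) {
        continue;
      }
      if (sum == null) {
        sum = new double[v.length];
      }
      for (int i = 0; i < v.length; i++) {
        sum[i] += n.similarity * v[i];
      }
      totalWeight += n.similarity;
      used++;
    }
    if (sum == null) {
      return null;
    }
    float[] result = new float[sum.length];
    for (int i = 0; i < sum.length; i++) {
      result[i] = (float) (sum[i] / totalWeight);
    }
    return result;
  }

  public static void main(String[] args) {
    // Stand-in for the existing item latent vectors (the Y matrix).
    Map<String, float[]> y = new HashMap<>();
    y.put("a", new float[] {1f, 0f});
    y.put("b", new float[] {0f, 1f});
    List<Neighbor> candidates =
        Arrays.asList(new Neighbor("a", 0.75), new Neighbor("b", 0.25));
    System.out.println(Arrays.toString(approximateVector(candidates, y, 2)));
  }
}
```

How the estimated vector is then written into Y depends on the serving-layer internals, which this sketch does not touch.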
 
Thanks.
Jason

2 ACCEPTED SOLUTIONS

Master Collaborator

It is not hard to expose, but seems like an internal implementation detail. The implementation already solves the cold start problem in a different way with fold-in. One issue with what you're suggesting is that there is no notion of attributes in the model. I assume you mean you have that externally. I understand the logic but it's a fairly different recommender model that you're making then. I think I'd direct you to just hack the code a bit. But I can keep this in mind in case several other use cases pop up that would make it make sense to just let the item vectors be set externally.

 

The oryx2 design is much more decomposed so you could put in another process that feeds any item/user updates you want onto a queue of updates. But this is a ways from being ready.

Master Collaborator

Yes, that's right.


15 REPLIES

Explorer

Sean,

 

Thanks for your reply.

And, sorry for the late response.

I was trying to implement a k-NN based approach to approximate the latent features for a cold-start new user.

Now I have started digging into Oryx to see where to insert the new user's latent features into the X matrix.

 

Following your hints, I checked the setPreference code, and I have two questions:

 

(1) I noticed the line of code below and traced the related code. I don't understand why a CandidateFilter is involved, or where the item gets added. Does it append a new entry to the Y matrix to hold the latent features of the new item?

 

if (newItem) {
  generation.getCandidateFilter().addItem(itemID);
}

(https://github.com/cloudera/oryx/blob/master/als-serving/src/main/java/com/cloudera/oryx/als/serving...

 

(2) Following on from question (1), why is there no similar handling for a new user? Say there is a user-item association and the user ID does not exist in the model: why are there no statements such as if (newUser) { // do something }?

 

Thanks a lot.

 

 

 

Master Collaborator

Yes, it becomes a new "row" in Y. The candidate filter is something else. This line of code is like a callback notifying the implementation that a new item exists. New users also cause a new row in X, but there is no equivalent 'candidate filter' for users because the same types of operations (recommend, etc.) are not supported for users.

Explorer

Thanks.

 

Regarding your comment that "New users also cause a new row in X", I just want to confirm what I traced (please let me know if my understanding below is not right). Thanks.

 

My tracing indicates that...

 

 

(1) The new X feature vector is added in:

https://github.com/cloudera/oryx/blob/master/als-serving/src/main/java/com/cloudera/oryx/als/serving...

(2) The feature vector is then updated via foldIn in:

https://github.com/cloudera/oryx/blob/master/als-serving/src/main/java/com/cloudera/oryx/als/serving...

(3) I do not need to worry about the "candidate filter", because it is only used for items.

 

Master Collaborator

Yes, that's right.

Explorer

Sean,

 

One more thing to follow up on: locking.

For example, I want to write an entry to the X matrix, so I do something like this:

 

Lock writeLock = generation.getXLock().writeLock();
writeLock.lock();
try {
  // write to X matrix
} finally {
  // always release the lock, even if the write throws
  writeLock.unlock();
}

 

(1) My understanding is that writeLock.lock() prevents simultaneous writes from other writers. Is that right?

(2) What about readLock? Can you explain when to use it?

Lock readLock = generation.getXLock().readLock();
readLock.lock(); 

 

 

Master Collaborator

This is just Java's locking library; it's not specific to the project. This is a lock that supports many readers at one time, but only one writer at a time (and no readers while a writer holds the write lock). You have to acquire the write lock to mutate the shared state, and you also need to acquire the read lock to read it, although holding the read lock does not exclude other readers.
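
As a minimal sketch of that pattern, here is a small map guarded by a ReentrantReadWriteLock from java.util.concurrent.locks. The GuardedMatrix class and its put and get methods are hypothetical names used for illustration; they are not Oryx code, and Oryx's own use of getXLock() is only assumed to follow the same general shape.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a shared ID-to-vector matrix guarded by a read/write lock. */
public class GuardedMatrix {

  private final Map<String, float[]> rows = new HashMap<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  /** Mutation: takes the write lock, which excludes all other writers and all readers. */
  public void put(String id, float[] vector) {
    Lock writeLock = lock.writeLock();
    writeLock.lock();
    try {
      rows.put(id, vector.clone());
    } finally {
      // Release in finally so an exception cannot leave the lock held.
      writeLock.unlock();
    }
  }

  /** Read: takes the read lock, which excludes writers but allows other readers. */
  public float[] get(String id) {
    Lock readLock = lock.readLock();
    readLock.lock();
    try {
      float[] v = rows.get(id);
      return v == null ? null : v.clone();
    } finally {
      readLock.unlock();
    }
  }

  public static void main(String[] args) {
    GuardedMatrix x = new GuardedMatrix();
    x.put("user1", new float[] {0.5f, 0.25f});
    System.out.println(x.get("user1").length);
  }
}
```

Copying the array on the way in and out is a design choice for this sketch, so that callers cannot modify a row without holding the lock.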