Explorer
Posts: 27
Registered: ‎01-06-2016

Oryx2 ALS recommender

(New thread at request - https://community.cloudera.com/t5/Data-Science-and-Machine/Oryx2-Kafka-Broker-Issue/m-p/56556)

 

Hi srowen,

I have a general question about the ALS recommender. Let's say that instead of a specific user rating, I'd like to recommend based on a usage count, i.e. every time a user interacts with an item, I submit a new ingestion: userID,itemID (with this type of input the "strength" is assigned a default value of 1).

 

I've seen that if I submit multiple requests like this for the same userID,itemID combo that /recommend/userID/?considerKnownItems=true will return itemID with a higher recommended score. 

 

Is this based on a cumulative "strength" i.e. multiple "ratings" of the same item adding up, or is it based on the fact that the item was "rated" more recently than other items? Or is it something else?

 

Alternatively, on ingestion, would it make sense to get the current "rating" of an item by a user and increment that by 1 before submitting it? 

Does my use case make sense or would it be suited to a different type of algorithm? 

Thanks

 

======

 

Response 

(Please start a new thread)

Yes, all scores are cumulative and added across all input. I am not sure what your use case is, but what you're suggesting is how it works: submitting (user, item, 1) adds 1 to the total strength of interaction.
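
The summing behaviour described above can be illustrated with a small sketch. This is not Oryx code: `total_strengths` is a hypothetical helper, and it only assumes that each input is a (user, item, strength) triple whose strength defaults to 1 when omitted, as described in the question.

```python
from collections import defaultdict


def total_strengths(events):
    """Sum strengths per (user, item) pair; the order of events does not matter."""
    totals = defaultdict(float)
    for event in events:
        user, item = event[0], event[1]
        # A two-field event carries no explicit strength, so it counts as 1
        strength = float(event[2]) if len(event) > 2 else 1.0
        totals[(user, item)] += strength
    return dict(totals)


events = [("u1", "i1"), ("u1", "i1"), ("u1", "i2", 3), ("u1", "i1")]
totals = total_strengths(events)
print(totals)
```

Under these assumptions, three separate (u1, i1) submissions end up with the same total strength as one submission with strength 3.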


Explorer
Posts: 27
Registered: ‎01-06-2016

Re: Oryx2 ALS recommender

Hi srowen,

thanks for the previous response. Here's my use case. I have a collection of articles (in my mind in the ALS example this correlates to users). For each article I have a list of related search terms (items in the ALS example). The search terms are not unique to an article (i.e. two articles could have the same search term) but there is a finite set of search terms. 

 

Currently, in my system, if a search term on an article is clicked I submit an ingestion (article, term, 1). I've also tweaked /recommend to only return items that appear in that article's known items. So, for any article, I can request a list of the most clicked search terms for that article. 

 

My question is this: would the ordering of terms returned for one article be influenced by the number of interactions of a term in another article? For example, if I had the following interactions

 

(articleA, termA, 1)
(articleA, termB, 1)
(articleA, termA, 1)
(articleB, termB, 1)
(articleB, termB, 1)
(articleB, termB, 1)
(articleB, termB, 1)

 

would termB appear higher than termA in articleA's recommendations because of articleB's interactions?

 

If so, would there be a way to enable/disable that?

 
Perhaps I may be thinking of using the ALS example in a way which it is not intended to be used?


Cloudera Employee
Posts: 461
Registered: ‎08-11-2014

Re: Oryx2 ALS recommender

If you really mean you want a list of most-clicked search terms per article, you don't need a recommender for that. Just count the clicks. If you mean you want a list of recommended other search terms to pursue from that article, then yeah that makes sense. However you might consider not filtering out search terms that you've never encountered before. Maybe there is a good latent connection in there.
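
If the goal really is just the most-clicked terms per article, the counting can be sketched in a few lines. This is a minimal illustration using the sample interactions from the question; `most_clicked` is a hypothetical helper and has nothing to do with the Oryx API.

```python
from collections import Counter, defaultdict


def most_clicked(clicks, article, n=10):
    """Return up to n (term, click_count) pairs for one article, most clicked first."""
    counts = defaultdict(Counter)
    for art, term, strength in clicks:
        counts[art][term] += strength
    return counts[article].most_common(n)


clicks = [
    ("articleA", "termA", 1),
    ("articleA", "termB", 1),
    ("articleA", "termA", 1),
    ("articleB", "termB", 1),
    ("articleB", "termB", 1),
    ("articleB", "termB", 1),
    ("articleB", "termB", 1),
]
top_a = most_clicked(clicks, "articleA")
print(top_a)  # [('termA', 2), ('termB', 1)]
```

With plain counting, articleB's clicks on termB have no effect on articleA's list, which is the main difference from a recommender, where all data feeds one shared model.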

 

Ordering does not matter, except for events that are intended to delete a user or item of course. Things are summed otherwise. In a sense, all data affects all recommendations. In your case, though, article B's activity is unlikely to have a dominant effect on article A if these are just a few of many articles. So I'm not sure what you mean to disable or enable.

Explorer
Posts: 27
Registered: ‎01-06-2016

Re: Oryx2 ALS recommender

Thanks again for the quick response. 

I have left the option open for not filtering out search terms never encountered before. We will probably look at utilising that feature eventually.

Thanks for the clarification of the data interactions.

The reason I am trying to get a broader understanding is that eventually we would want to introduce article classifications (science/sport/music/etc)

e.g. if termA has a large number of clicks for articleA, and termA is also linked with articleB, and articleB has the same classification as articleA, boost termA in articleB's list of items.

I am unsure how to go about this approach.
Explorer
Posts: 27
Registered: ‎01-06-2016

Re: Oryx2 ALS recommender

Hey srowen, 

I noticed you mentioned "Ordering does not matter, except for events that are intended to delete a user or item of course." How do I go about deleting a user or item from the system? I see in the Ingest documentation:

 

  • userID,itemID,: interpreted as a "delete" for the user-item association. timestamp is current system time

 

But if I wanted to completely delete a user?

Cloudera Employee
Posts: 461
Registered: ‎08-11-2014

Re: Oryx2 ALS recommender

Well, in theory you'd have to delete all the user's items. In practice, you probably don't want to do that. If the user's inactive, that doesn't mean their data isn't useful. You just don't query for that user. Eventually their data decays away anyway.
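
For completeness, deleting every association for one user could be sketched as below. It relies only on the delete format quoted from the Ingest documentation above (userID,itemID, with an empty strength). The helper name `delete_lines` is hypothetical, and the list of the user's known items is assumed to be supplied by the caller.

```python
def delete_lines(user_id, known_item_ids):
    """Build one ingest line per known item, each marking a user-item delete."""
    # The trailing comma leaves the strength empty, which the quoted
    # documentation describes as a delete for that user-item association
    return [f"{user_id},{item_id}," for item_id in known_item_ids]


lines = delete_lines("u1", ["i1", "i2"])
print("\n".join(lines))
```

Each resulting line would then be submitted through the same ingestion path as ordinary interactions.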

Explorer
Posts: 27
Registered: ‎01-06-2016

Re: Oryx2 ALS recommender

Ok, thanks. I guess that "decaying" is possibly handled by the Kafka retention time? 

 

Also, do you have any input on the article classification I mentioned? I've seen that the documentation discusses the use of a RescorerProvider/Rescorer to filter results (the example given uses user location). How would I incorporate the article's classification into the current system? Would it require building different models for each classification? 

Sorry for all the questions, I'm new to this and I appreciate you taking the time to answer. 

Cloudera Employee
Posts: 461
Registered: ‎08-11-2014

Re: Oryx2 ALS recommender

Decay is per-batch, and not connected to wall-clock or event time. You can inject whatever logic you like in a Rescorer, including location info. How it changes the score is up to your business logic. It doesn't necessarily mean building several models. It really depends on what you are trying to do. If you just mean to filter out results that are not suitable, that's simple filtering logic, nothing to it.
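
To make the filtering and rescoring idea concrete, here is a language-neutral sketch in Python. Oryx's actual Rescorer is a Java interface, so nothing below is its API: `SimpleRescorer`, `apply_rescorer`, the allowed-item set and the boost table are all hypothetical names chosen for illustration.

```python
class SimpleRescorer:
    """Filters out unsuitable items and optionally boosts the scores of others."""

    def __init__(self, allowed_items, boosts=None):
        self.allowed_items = set(allowed_items)
        self.boosts = boosts or {}

    def is_filtered(self, item_id):
        # Business rule (assumed): only items in the allowed set may be returned
        return item_id not in self.allowed_items

    def rescore(self, item_id, original_score):
        # Items without an entry in the boost table keep their original score
        return original_score * self.boosts.get(item_id, 1.0)


def apply_rescorer(recommendations, rescorer):
    """Drop filtered items, rescore the rest, and sort by the new score."""
    kept = [
        (item, rescorer.rescore(item, score))
        for item, score in recommendations
        if not rescorer.is_filtered(item)
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


recs = [("termA", 0.9), ("termB", 0.8), ("termC", 0.7)]
rescorer = SimpleRescorer(allowed_items={"termB", "termC"}, boosts={"termC": 2.0})
result = apply_rescorer(recs, rescorer)
print(result)
```

Here termA is filtered out and termC moves ahead of termB because of its boost. What goes into the allowed set or the boost table is the business logic referred to above.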

Explorer
Posts: 27
Registered: ‎01-06-2016

Re: Oryx2 ALS recommender

Thanks. 

 

For the use case I described above, I guess the simplest approach is to introduce a rescorer that filters out items that do not match the classification of the current article. 

Explorer
Posts: 27
Registered: ‎01-06-2016

Re: Oryx2 ALS recommender

Sorry, that was a gross oversimplification on my part. I see now that that approach does not make sense. 

 

The use case is that I would like to only return items that were recommended based on ratings by users (articles) of the same classification. 

 

When we are given a list of recommended items we don't actually know the users (articles) that rated that item which has resulted in the item being recommended. So how could I filter the items with a rescorer in this case?

 

This is why I asked about building different models. Does it make sense to build a different model for each article classification? 
