Support Questions

Srini_D · ‎10-29-2014

Hi, I ran the distributed recommender using the RecommenderJob class and provided MovieLens data as input. The data has 943 users, but recommendations come out for only somewhere around 500+ users. Am I making a mistake anywhere? I provided a CSV file with item ID, user ID and preference value as input.

srowen · ‎10-29-2014

By default, the distributed implementations prune infrequent items and low similarities in order to scale up. With a tiny data set, this can mean some items are removed entirely or become un-recommendable. While you can change this behavior, it is not useful to run the Hadoop-based job on such a small data set.
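To make the effect concrete, here is a minimal toy sketch of how a similarity cutoff can leave some users with no recommendations at all. It is not Mahout's implementation: the function names, the `TOY_PREFS` data and the Tanimoto-plus-threshold rule are all assumptions chosen for illustration, and it ignores rating values entirely.

```python
from itertools import combinations

# Hypothetical toy data: user -> set of items the user has rated.
TOY_PREFS = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A", "B", "C"},
    "u4": {"C", "D"},
}


def item_similarities(prefs, threshold):
    """Tanimoto (Jaccard) similarity between items; pairs below threshold are pruned."""
    users_by_item = {}
    for user, items in prefs.items():
        for item in items:
            users_by_item.setdefault(item, set()).add(user)

    sims = {}
    for a, b in combinations(sorted(users_by_item), 2):
        both = len(users_by_item[a] & users_by_item[b])
        either = len(users_by_item[a] | users_by_item[b])
        sim = both / either
        # Keep only pairs that co-occur and clear the cutoff.
        if sim > 0 and sim >= threshold:
            sims[(a, b)] = sims[(b, a)] = sim
    return sims


def recommend_all(prefs, threshold=0.0):
    """Return user -> ranked item list; users with no candidates are left out."""
    sims = item_similarities(prefs, threshold)
    recs = {}
    for user, owned in prefs.items():
        scores = {}
        for (a, b), sim in sims.items():
            # Item b is a candidate if it is similar to something the user already has.
            if a in owned and b not in owned:
                scores[b] = scores.get(b, 0.0) + sim
        if scores:
            recs[user] = sorted(scores, key=lambda i: (-scores[i], i))
    return recs


if __name__ == "__main__":
    print(len(recommend_all(TOY_PREFS, threshold=0.0)))  # users covered without pruning
    print(sorted(recommend_all(TOY_PREFS, threshold=0.4)))  # users covered with pruning
```

In this toy data every user gets at least one recommendation with no cutoff, but once the weak C-D similarity is pruned, u2 and u3 have nothing left to be recommended. The same kind of loss, at a larger scale, would be one plausible reason for fewer users appearing in the job output.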

Srini_D · ‎10-29-2014

Oh, is that so? But how can we alter this behaviour to get recommendations for all users?

Srini_D · ‎10-29-2014

The dataset has 100k records, but they correspond to 943 users and 1682 items. The job creates vectors from the data and computes the similarity measure using them. Yes, it's minute considering how far distributed programming can scale, but I'm interested to know how to alter the behaviour.

Cloudera Community

Distributed Recommender not giving recommendations for all users