Cloudera Community > Support > Support Questions



Oryx 1 ALS computation with Hadoop

Explorer

Created 08-28-2015 02:10 PM


Sean,

I want to confirm one thing.

In Oryx 1 ALS, there are iterations that compute latent features for users and items. During the computation of latent features for an item, the user latent-feature matrix X (computed so far) is held fixed, and the item's latent features are updated by a least-squares regression.

In the Hadoop case, say I have 5 YARN nodes. In the implementation, is the same X matrix passed to each Hadoop node when it runs ALS for items (that is, are there 5 copies of X, one inside each Hadoop node)?

Thanks.

Jason

5 REPLIES

Re: Oryx 1 ALS computation with Hadoop

Master Collaborator

Created 08-29-2015 12:47 AM

Yes, that's how it works in 1.x. Really, a subset of X is passed to workers, depending on which rows they will need, so it's not the whole matrix. This subset is loaded into memory, so it's doing a fast but memory-hungry in-memory join.
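The "subset of X" idea above can be sketched in NumPy. This is a hypothetical illustration of the per-item least-squares update, not Oryx's actual code; the function name and parameters here are invented. The point is that updating one item's factors only requires the rows of X for users who interacted with that item.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, k, lam = 100, 5, 0.1
X = rng.normal(size=(n_users, k))  # user factors, held fixed this half-iteration

def update_item(user_ids, ratings, X, lam):
    """Least-squares update for one item, using only the needed rows of X."""
    X_sub = X[user_ids]                 # the 'subset of X' shipped to a worker
    A = X_sub.T @ X_sub + lam * np.eye(X_sub.shape[1])
    b = X_sub.T @ ratings
    return np.linalg.solve(A, b)

# suppose only users 3, 7, and 42 interacted with this item
y_i = update_item([3, 7, 42], np.array([4.0, 5.0, 3.0]), X, lam)
```

Joining each item's observed ratings against those rows in memory is what makes the step fast but memory-hungry.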

Re: Oryx 1 ALS computation with Hadoop

Explorer

Created 08-29-2015 09:31 AM

When it solves the normal equations, wouldn't it need the whole X matrix to form XtX (or YtY)?

Re: Oryx 1 ALS computation with Hadoop

Master Collaborator

Created 08-29-2015 09:43 AM

That's right, but this term can be pre-computed fairly efficiently and sent to the workers. XtX is quite small relative to X, since the matrix is tall and skinny.
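A sketch of why this works, in the implicit-feedback setting of Hu et al. that Oryx 1 follows. XtX is only k x k, so it can be computed once and broadcast; each item update then adds a sparse correction involving only the observed users, since the confidence c equals 1 everywhere else. This is a hypothetical NumPy illustration under those assumptions, not Oryx's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, k, lam, alpha = 1000, 5, 0.1, 40.0
X = rng.normal(size=(n_users, k))
XtX = X.T @ X  # k x k Gram matrix: precomputed once, sent to every worker

def update_item_implicit(user_ids, r, X, XtX, lam, alpha):
    """Solve (Xt C X + lam*I) y = Xt C p via XtX plus a sparse correction."""
    X_sub = X[user_ids]
    c = 1.0 + alpha * r                 # confidences for observed users only
    # Xt C X = XtX + Xt_sub (C - I) X_sub, because c == 1 for unobserved users
    A = XtX + X_sub.T @ ((c - 1.0)[:, None] * X_sub) + lam * np.eye(k)
    b = X_sub.T @ c                     # Xt C p, since p == 1 only where observed
    return np.linalg.solve(A, b)

y = update_item_implicit([3, 7, 42], np.array([2.0, 1.0, 5.0]), X, XtX, lam, alpha)
```

The correction term touches only the handful of observed rows, so the full X never has to travel with every item.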

Re: Oryx 1 ALS computation with Hadoop

Explorer

Created 08-30-2015 09:51 AM

Thanks.

A related question: I notice the regularization parameter lambda is weighted by ru. This differs from Hu's original implicit-feedback paper. Any idea why we want to weight lambda?

Re: Oryx 1 ALS computation with Hadoop

Master Collaborator

Created 08-30-2015 11:00 AM

Yes, this is the 'weighted regularization' idea grafted on from another paper: http://link.springer.com/chapter/10.1007%2F978-3-540-68880-8_32

MLlib does it too and explains one pretty good reason for it: http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html#scaling-of-the-regularization...

You could also say it exists so as not to heavily favor fitting the preferences of prolific users.
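The effect can be sketched by contrasting plain L2 regularization with the weighted (ALS-WR-style) variant, where the penalty for a user is scaled by that user's observation count. This is a hypothetical NumPy illustration of the idea, not Oryx's exact weighting; the function and names are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
k, lam = 5, 0.1
Y_sub = rng.normal(size=(30, k))    # item factors for items this user rated
r = rng.uniform(1, 5, size=30)      # a prolific user's 30 ratings

def solve_user(Y_sub, r, lam, weighted):
    n_u = len(r)                            # this user's observation count
    reg = lam * (n_u if weighted else 1.0)  # ALS-WR scales lambda by n_u
    A = Y_sub.T @ Y_sub + reg * np.eye(Y_sub.shape[1])
    return np.linalg.solve(A, Y_sub.T @ r)

x_plain = solve_user(Y_sub, r, lam, weighted=False)
x_wr = solve_user(Y_sub, r, lam, weighted=True)
# with weighting, the prolific user's factors are shrunk harder, so heavy
# raters don't get fitted disproportionately tightly for the same lambda
```

In the ridge solve, a larger effective penalty yields a smaller-norm solution, which is exactly the "don't over-favor prolific users" effect described above.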
