
Oryx: Collaborative filtering/Recommendation


Explorer

Hi,

I set up Oryx on the CDH QuickStart VM and ran a sample collaborative filtering example with the ALS algorithm in local mode, which worked fine.

But when I try it with actual data (300 MB, local mode), I do not see any progress (X and Y folder creation, etc.). Only stats.json, computation.conf, and _SUCCESS (0 KB) get created, nothing else. How can I confirm the computation is running? Is there any other location where logs are generated?

 

Oryx.conf

***********

model=${als-model}
model.instance-dir=/home/cloudera/oryx/example
model.local-computation=true
model.local-data=true
model.features=25
model.lambda=0.065

 

/home/cloudera/oryx/example/00000
[root@localhost 00000]# ls -lrt
total 16
drwxrwxr-x 2 cloudera cloudera 4096 Mar 24 08:26 inbound
-rwxr-xr-x 1 cloudera cloudera 5252 Mar 24 08:35 computation.conf
-rwxr-xr-x 1 cloudera cloudera    0 Mar 24 08:36 _SUCCESS
-rwxr-xr-x 1 cloudera cloudera   50 Mar 24 08:36 stats.json

 

Regards,

Suresh

5 REPLIES

Re: Oryx: Collaborative filtering/Recommendation

Master Collaborator

What is in inbound (file names, contents)? One possibility is that the file isn't named in a recognized way.

What's in the log output from the time when it should build the model?

Re: Oryx: Collaborative filtering/Recommendation

Explorer

Sean,

 

I think I had the file in .dat format; I changed it to .csv and this time it worked. Is it mandatory to always have the file in CSV?

Also, I had 13292825 distinct users and 13558965 distinct items in the file, but the presentation prints the log below.

INFO: All model elements loaded, 5167 users and 157950 items

The recommendations are the same for most of the random users I have tried, and this doesn't match the Mahout recommendations that I got. Am I missing something?

 

Regards,

Suresh

Re: Oryx: Collaborative filtering/Recommendation

Master Collaborator

Yeah it's expecting .csv, .zip or .gz files. You can change the delimiter that it expects in the config file. I'm questioning whether it really makes sense to filter on file name here at all.
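
To spot a file that might be skipped because of its name, something like the following rough Python sketch could help. It only compares file suffixes against the three mentioned above; the suffix set and the function name are assumptions for illustration, not Oryx's own check.

```python
from pathlib import Path

# Suffixes taken from the reply above; treat this set as an assumption,
# not as the authoritative list that Oryx checks.
RECOGNIZED_SUFFIXES = {".csv", ".zip", ".gz"}


def unrecognized_inbound_files(inbound_dir):
    """Return sorted names of regular files whose suffix is not recognized."""
    inbound = Path(inbound_dir)
    return sorted(
        p.name
        for p in inbound.iterdir()
        if p.is_file() and p.suffix.lower() not in RECOGNIZED_SUFFIXES
    )


if __name__ == "__main__":
    # Path matching the instance directory used earlier in this thread.
    inbound = Path("/home/cloudera/oryx/example/00000/inbound")
    if inbound.is_dir():
        for name in unrecognized_inbound_files(inbound):
            print("possibly ignored:", name)
```

A .dat file in inbound would show up in this list, which matches what happened here.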

 

Are you sure the data is in CSV? Maybe there is still some mismatch between what's being read and what you put in.

 

Start from an empty directory, and add directory "00000/inbound" under it. Put your data files there. Make sure your config file points to the root directory above 00000. That's a good way to make sure it's all from a clean slate.
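
The clean-slate steps above could be scripted roughly as below. This is only a sketch: the function name `prepare_clean_instance` is made up for illustration, and it refuses to touch a root directory that already has content rather than deleting anything.

```python
import shutil
from pathlib import Path


def prepare_clean_instance(root_dir, data_files):
    """Create root_dir/00000/inbound under an empty root and copy data files in."""
    root = Path(root_dir)
    # Refuse to reuse a directory that may still hold old data or config.
    if root.exists() and any(root.iterdir()):
        raise FileExistsError(f"{root} is not empty; start from an empty directory")
    inbound = root / "00000" / "inbound"
    inbound.mkdir(parents=True)
    for f in data_files:
        shutil.copy(f, inbound / Path(f).name)
    return inbound
```

After this, model.instance-dir in the config file should point at the root directory (the one above 00000), not at 00000 or inbound.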

 

Maybe you can show some of your data lines to double check that?

Re: Oryx: Collaborative filtering/Recommendation

Explorer

Thanks for your prompt response,

 

oryx.conf

************

model=${als-model}
model.instance-dir=/home/cloudera/oryx/Trade
model.local-computation=true
model.local-data=true
model.features=25
model.lambda=0.065

Data

 

[cloudera@localhost oryx]$ head -10 Trade/00000/inbound/GroomedTradeDataForMahout.csv
904012,1043343,0
1207700547,6162255,2990
1213547853,-1,31407260.12
97850005,8639263,9900
119320001,5750515,1536300
1207700547,5845959,5300
97850005,5755036,781752
1207700519,6166071,4391
119790001,8580013,1800

 

 

 

 

Regards,

Suresh

Re: Oryx: Collaborative filtering/Recommendation

Master Collaborator

That looks just fine to me. And have you verified that these files contain the number of distinct values you expect? I usually do something like

 

cut -d, -f1 Trade/00000/inbound/GroomedTradeDataForMahout.csv | sort -u | wc -l
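
The same check can be done for both columns in one pass with a few lines of Python. This is a sketch under assumptions: the input is taken to be delimiter-separated user,item,value lines like the sample posted above, the function name `count_distinct` is made up, and all distinct IDs are held in memory, which may be heavy for very large files.

```python
import csv


def count_distinct(path, delimiter=","):
    """Return (distinct values in column 1, distinct values in column 2)."""
    users, items = set(), set()
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            users.add(row[0])
            items.add(row[1])
    return len(users), len(items)


if __name__ == "__main__":
    import sys

    # Usage: python count_distinct.py path/to/file.csv
    if len(sys.argv) > 1:
        n_users, n_items = count_distinct(sys.argv[1])
        print(n_users, "distinct users,", n_items, "distinct items")
```

If these counts match expectations but the number of users and items reported at load time is far smaller, the mismatch is probably on the reading side rather than in the file.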

 

What happens with a fresh run, just to rule out accidentally using old data/config?

 

If it's still the same number, could I have a copy of your data file? (Offline is fine: sowen at cloudera.)
